ABSTRACT

A general duality theory is given for smooth nonconvex optimization problems, covering both the finite-dimensional case and the calculus of variations. The results are quite similar to the convex case; in particular, with every problem $(P)$ is associated a dual problem $(P^*)$ having opposite value. This is done at the expense of broadening the framework, from smooth functions $\mathbb{R}^n \to \mathbb{R}$ to Lagrangian submanifolds of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$.
DUALITY IN NONCONVEX OPTIMIZATION AND CALCULUS OF VARIATIONS

Ivar Ekeland


§0. Introduction.

Duality methods are nowadays an important tool in the study of convex optimization problems. A systematic treatment within the framework of convex analysis can be found in the books of R. T. Rockafellar [14] and I. Ekeland and R. Temam [8]. However, it is easily forgotten that duality methods have been in use for quite a long time in classical mechanics, where people are used to stating a problem either in terms of the phase variables $x$ or in terms of the momentum variables $p$, the mapping $x \mapsto p$ being the Legendre transformation. A major difficulty lies in the fact that the Legendre transformation need not be one-to-one, except, of course, in the convex case.

This paper aims to provide people used to convex optimization problems with a systematic and updated treatment of duality theory for the smooth nonconvex case. The first two sections set up the general framework. It turns out that the framework of functions is not broad enough to cover our needs, because the Legendre transform of a smooth nonconvex function need not be a function. So we define Lagrangian submanifolds of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ as the right concept to work with, because the Legendre transform of a Lagrangian submanifold is still a Lagrangian submanifold, and because a Lagrangian submanifold comes very close to being a function from $\mathbb{R}^n$ to $\mathbb{R}$. Section I investigates the local properties of Lagrangian submanifolds, and Section II studies the Legendre transform in this framework.

Sponsored by the United States Army under Contract No. DAAG29-75-C-0024.

The  duality  theorems  then  follow  quite  easily,  either  in  Section  III 
for  the  finite-dimensional  case,  or  in  Section  IV  for  the  calculus  of 
variations.  They  are  exactly  what  one  would  expect  from  the  convex 
case.  References  to  the  bibliography  are  relegated  to  Section  V. 

The author wishes to acknowledge long and numerous conversations about this matter with J.-P. Aubin and F. Clarke, and the expert typing
§I. Lagrangian submanifolds.

Let $f$ be a smooth real-valued function on $\mathbb{R}^n$. We can associate with $f$ the following $n$-dimensional submanifold of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$:

(1.1)  $V_f = \{(x, f'(x), f(x)) \mid x \in \mathbb{R}^n\}$.

This submanifold has the property of annihilating the differential form $\omega$ defined at any point $(x, p, z)$ of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ by the formula

(1.2)  $\omega = dz - \sum_{i=1}^{n} p_i\, dx_i$.

Indeed, the restriction of $\omega$ to $V_f$ reduces to $df - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\, dx_i$, which is identically zero. This motivates the following definitions:
Definition 1.1. A Lagrangian submanifold of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ is a closed $n$-dimensional $C^\infty$-submanifold $V$ such that

(1.3)  $i_V^*\, \omega = 0$

where $i_V : V \to \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ is the canonical injection and $i_V^* : T^*(\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}) \to T^*V$ the induced map of differential 1-forms. We shall say that $x \in \mathbb{R}^n$ is a critical point of $V$ and that $z \in \mathbb{R}$ is a critical value whenever:

(1.4)  $(x, 0, z) \in V$.

We shall associate with $V$ a multivalued mapping $F_V$ from $\mathbb{R}^n$ to $\mathbb{R}$:

(1.5)  $F_V(x) = \{z \mid \exists p \in \mathbb{R}^n : (x, p, z) \in V\}$

and call it the characteristic map of $V$.
In both cases, the critical points/values of $V_f$ are the critical points/values of $f$, and the characteristic map $F_{V_f}$ of $V_f$ coincides with $f$:

(1.8)  $\forall x \in \mathbb{R}^n, \quad F_{V_f}(x) = \{f(x)\}$.

We now seek a partial converse: describe, at least locally, a given Lagrangian submanifold $V$ in terms of a smooth function $f : \mathbb{R}^n \to \mathbb{R}$. For that purpose, we introduce the set $\mathcal{R}$ of points $x \in \mathbb{R}^n$ such that the 1-forms $i_V^* dx_1, \ldots, i_V^* dx_n$ are linearly independent at every point $(x, p, z)$ of $V$ projecting on $x$.

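Proposition 1.2. The subset $\mathbb{R}^n \setminus \mathcal{R}$ has Lebesgue measure zero in $\mathbb{R}^n$. For every point $\bar{x} \in \mathcal{R}$ there exist a (possibly empty) countable set of indices $A$, a family $\mathcal{U}_\alpha$, $\alpha \in A$, of neighborhoods of $\bar{x}$ in $\mathbb{R}^n$, and a family $f_\alpha : \mathcal{U}_\alpha \to \mathbb{R}$ of smooth functions, such that:

(1.9)  $\pi_x^{-1}(\bar{x}) \subset \bigcup_{\alpha \in A} \mathcal{V}_\alpha \subset V$

where:

(1.10)  $\mathcal{V}_\alpha = \{(x, f_\alpha'(x), f_\alpha(x)) \mid x \in \mathcal{U}_\alpha\}, \quad \alpha \in A$.

Note that (1.9) implies that $F_V(\bar{x}) = \{f_\alpha(\bar{x}) \mid \alpha \in A\}$. Intuitively, the part of $F_V$ lying above $\bar{x}$ is decomposed into smooth branches $f_\alpha$, $\alpha \in A$, with $z_\alpha = f_\alpha(\bar{x})$ and $p_\alpha = f_\alpha'(\bar{x})$. Two branches may intersect, but they must do so transversally: if $f_\alpha(\bar{x}) = f_\beta(\bar{x})$ with $\alpha \neq \beta$, then $f_\alpha'(\bar{x}) \neq f_\beta'(\bar{x})$.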
Proof of Proposition 1.2. To say that the 1-forms $i_V^* dx_1, \ldots, i_V^* dx_n$ are linearly independent at $(x, p, z) \in V$ means that $(x, p, z)$ is a regular point for the projection $\pi_x : V \to \mathbb{R}^n$. The set $\mathbb{R}^n \setminus \mathcal{R}$ is just the set of critical values for $\pi_x$, and it follows from Sard's theorem that it has measure zero.

Take $\bar{x} \in \mathcal{R}$, and let $\{(\bar{x}, p_\alpha, z_\alpha) \mid \alpha \in A\}$ be the (possibly empty) set of points of $V$ projecting on $\bar{x}$. By the definition of $\mathcal{R}$, each $(\bar{x}, p_\alpha, z_\alpha)$, $\alpha \in A$, is a regular point for $\pi_x$. By the implicit function theorem, there are neighborhoods $\mathcal{U}_\alpha$ of $\bar{x}$ and $\mathcal{V}_\alpha$ of $(\bar{x}, p_\alpha, z_\alpha)$ such that $\pi_x : \mathcal{V}_\alpha \to \mathcal{U}_\alpha$ is a diffeomorphism. In other words, there are real-valued $C^\infty$ functions $f_\alpha$ and $g_{\alpha i}$, $1 \le i \le n$, defined over $\mathcal{U}_\alpha$, such that:

(1.11)  $(x, p, z) \in \mathcal{V}_\alpha \iff \{x \in \mathcal{U}_\alpha,\ z = f_\alpha(x),\ p_i = g_{\alpha i}(x)\}$.

The vanishing of $i_V^* \omega$ means that:

(1.12)  $df_\alpha - \sum_{i=1}^{n} g_{\alpha i}(x)\, dx_i = 0$ over $\mathcal{U}_\alpha$,

which yields:

(1.13)  $g_{\alpha i}(x) = \frac{\partial f_\alpha}{\partial x_i}(x) \quad \forall x \in \mathcal{U}_\alpha$.

Writing (1.13) into (1.11), we get formula (1.10), with formula (1.9) being satisfied by construction. It only remains to prove that the set $A$ is at most countable. For this, notice that:

(1.14)  $\pi_x^{-1}(\bar{x}) \cap \mathcal{V}_\alpha = \{(\bar{x}, p_\alpha, z_\alpha)\}$

and hence that $\alpha \neq \beta \Rightarrow (\bar{x}, p_\beta, z_\beta) \notin \mathcal{V}_\alpha$. This shows that all points in $\pi_x^{-1}(\bar{x})$ are isolated, hence any compact subset of $V$ can contain only a finite number of them. As $V$ is a closed subset of $\mathbb{R}^{2n+1}$, it can be written as a countable union of compact subsets, and the result follows. ■

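In the special case where the map $\pi_x$ is proper at $\bar{x}$, it is easily seen that the set $A$ has to be finite. Setting $\mathcal{U} = \bigcap_{\alpha \in A} \mathcal{U}_\alpha$, we get the following corollary:

Corollary 1.3. Assume moreover the map $\pi_x$ is proper. Then $\mathcal{R}$ is open in $\mathbb{R}^n$, and for every point $\bar{x} \in \mathcal{R}$ there is a neighborhood $\mathcal{U}$ of $\bar{x}$ and a (possibly empty) finite family of smooth functions $f_\alpha : \mathcal{U} \to \mathbb{R}$, $\alpha \in A$, such that:

(1.15)  $\pi_x^{-1}(\mathcal{U}) = \{(x, f_\alpha'(x), f_\alpha(x)) \mid x \in \mathcal{U},\ \alpha \in A\}$.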

We now have a description of $\pi_x^{-1}(\bar{x})$ which is valid whenever $\bar{x} \in \mathcal{R}$, i.e. for almost every point $\bar{x} \in \mathbb{R}^n$. Points in $\mathbb{R}^n \setminus \mathcal{R}$ form a negligible subset, but they may nevertheless turn out to be important, so we will attempt a partial description in that case also.

Proposition 1.4. Let $t \mapsto (x(t), p(t), z(t))$ be a $C^1$ map from $]0, T]$ into $V$ such that $x(t) \in \mathcal{R}$ for all $t > 0$. Assume that, when $t \to 0$:

(1.16)  $x(t) \to \bar{x}$ and $\frac{dx}{dt}(t) \to \xi$

(1.17)  $z(t) \to \bar{z}$

(1.18)  $\liminf \|p(t) - \bar{p}\| = 0$,

with $(\bar{x}, \bar{p}, \bar{z})$ an isolated point of $\pi_{xz}^{-1}(\bar{x}, \bar{z})$. Then:

(1.19)  $p(t) \to \bar{p}$

(1.20)  $\frac{dz}{dt}(t) \to \bar{p} \cdot \xi$.

Proof. As $(\bar{x}, \bar{p}, \bar{z})$ is an isolated point in $\pi_{xz}^{-1}(\bar{x}, \bar{z})$, there is a compact neighborhood $\mathcal{K}$ of $(\bar{x}, \bar{p}, \bar{z})$ in $V$ such that:

(1.21)  $(\bar{x}, p, \bar{z}) \in \mathcal{K} \Rightarrow p = \bar{p}$.

Assume $p(t)$ does not converge to $\bar{p}$. Then there is an open neighborhood $\mathcal{V}$ of $(\bar{x}, \bar{p}, \bar{z})$, contained in $\mathcal{K}$, and a sequence $t_n \to 0$ such that:

(1.22)  $(x(t_n), p(t_n), z(t_n)) \in \mathcal{K} \setminus \mathcal{V}$.

Using (1.16) and (1.17), together with the fact that $\mathcal{K} \setminus \mathcal{V}$ is compact, we can extract a subsequence converging to some point

(1.23)  $(\bar{x}, p', \bar{z}) \in \mathcal{K} \setminus \mathcal{V}$

contradicting (1.21).

So $p(t)$ has to converge to $\bar{p}$, yielding (1.19). Setting $z(0) = \bar{z}$, we define a continuous real-valued function $t \mapsto z(t)$ on $[0, T)$. It follows from Proposition 1.2 and the fact that $x(t) \in \mathcal{R}$ for $t > 0$ that this function is differentiable on $]0, T)$ with derivative:

(1.24)  $\frac{dz}{dt}(t) = p(t) \cdot \frac{dx}{dt}(t)$.

When $t \to 0$, the right-hand side converges to $\bar{p} \cdot \xi$, and so does the left-hand side. ■


Note that $\frac{dp}{dt}(t)$ need not converge. Note also that (1.16) and (1.20) imply that $\frac{d^+x}{dt}(0) = \xi$ and $\frac{d^+z}{dt}(0) = \bar{p} \cdot \xi$, with $\frac{d^+}{dt}$ denoting the right-derivative. Equation (1.20) can be written:

(1.25)  $\frac{d^+z}{dt}(0) = \bar{p} \cdot \frac{d^+x}{dt}(0)$

which expresses the vanishing of $dz - p\,dx$ above a point $\bar{x}$ not in $\mathcal{R}$.

Let  us  give  a more  accurate  picture  in  a simple  case: 

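Proposition 1.5. Assume $\pi_x$ is proper and $\pi_x^{-1}(\bar{x})$ is finite. Let a simply connected subset $\Omega$ of $\mathcal{R}$ be given in the following way:

(1.26)  $\Omega = \{\bar{x} + t\xi \mid 0 < t < a,\ \xi \in S\}$

with $S$ an open subset of the unit sphere $\xi_1^2 + \cdots + \xi_n^2 = 1$. There is a (possibly empty) finite family of $C^1$ functions $f_\alpha : \Omega \cup \{\bar{x}\} \to \mathbb{R}$, $\alpha \in A$, such that:

(1.27)  $\pi_x^{-1}(\Omega \cup \{\bar{x}\}) = \{(x, f_\alpha'(x), f_\alpha(x)) \mid x \in \Omega \cup \{\bar{x}\},\ \alpha \in A\}$.

By a derivative of $f_\alpha$ at $\bar{x}$ we mean a linear functional $f_\alpha'(\bar{x})$ such that:

(1.28)  $\forall \varepsilon > 0,\ \exists \eta > 0 : \ \|x - \bar{x}\| \le \eta$ and $x \in \Omega \ \Rightarrow\ |f_\alpha(x) - f_\alpha(\bar{x}) - \langle f_\alpha'(\bar{x}), x - \bar{x} \rangle| \le \varepsilon \|x - \bar{x}\|$.

By a $C^1$ function on $\Omega \cup \{\bar{x}\}$ we mean a function $f_\alpha$ such that $f_\alpha'(x)$ is well-defined and continuous on $\{\bar{x}\} \cup \Omega$.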

Proof of Proposition 1.5. For every $x \in \Omega$, the set $\pi_x^{-1}(x)$ has to be both compact (because $\pi_x$ is proper) and discrete (because $x \in \mathcal{R}$), so it is finite. By Proposition 1.2, the map $\pi_x : \pi_x^{-1}(\Omega) \to \Omega$ is a covering. As $\Omega$ is simply connected, the restriction of $\pi_x$ to each connected component of $\pi_x^{-1}(\Omega)$ is a diffeomorphism, hence the representation formula:

(1.29)  $\pi_x^{-1}(\Omega) = \{(x, f_\alpha'(x), f_\alpha(x)) \mid x \in \Omega,\ \alpha \in A\}$.

Now fix $\alpha \in A$ and let $x$ converge to $\bar{x}$ in $\Omega$. As $\pi_x$ is proper, $(x, f_\alpha'(x), f_\alpha(x))$ has cluster points $(\bar{x}, p, z) \in \pi_x^{-1}(\bar{x})$. As this set is finite, all its points are isolated. As in the preceding proof, we conclude that $f_\alpha'(x) \to p_\alpha$ and $f_\alpha(x) \to z_\alpha$. Setting $f_\alpha(\bar{x}) = z_\alpha$ and $f_\alpha'(\bar{x}) = p_\alpha$, we get a $C^1$ function as desired. ■

Let us conclude this investigation of Lagrangian submanifolds by the following remark, which throws some light on the case where $\pi_x^{-1}(\bar{x})$ is not discrete. Let $t \mapsto (x(t), p(t), z(t))$ be a $C^1$ path drawn on $V$ along which $x(t)$ is constant: $x(t) = \bar{x}$, $0 \le t \le T$. Then $z(t)$ has to be constant also: $z(t) = \bar{z}$, $0 \le t \le T$, so in fact only $p(t)$ varies. This follows easily from the vanishing of $i_V^* \omega$, which yields in this case $\frac{dz}{dt}(t) = \sum_{i=1}^{n} p_i \frac{dx_i}{dt}(t) = 0$. In particular, if $\mathcal{V}$ is an open path-connected subset of $V$ projecting on $\bar{x}$, i.e. $\mathcal{V} \subset \pi_x^{-1}(\bar{x})$, then $\mathcal{V}$ is also contained in some $n$-dimensional affine subspace $H = \{(x, p, z) \mid x = \bar{x},\ z = \bar{z}\}$ as an open path-connected subset (openness follows from the fact that $\dim V = n = \dim H$).
§II. The Legendre transformation.

The mapping $\mathcal{L}$ of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ into itself defined by:

(2.1)  $\mathcal{L}(x, p, z) = (x', p', z')$, with $x' = p$, $p' = x$, $z' = px - z$,

is called the Legendre transformation. Note that:

Proposition 2.1. The Legendre transformation is a $C^\infty$ involution:

(2.2)  $\mathcal{L}^2 = \mathrm{Id}$.

Proof. Using notations (2.1), we set $\mathcal{L}(x', p', z') = (x'', p'', z'')$, with

$x'' = p' = x$
$p'' = x' = p$
$z'' = p'x' - z' = px - (px - z) = z$

hence the result. ■

The fundamental fact about the Legendre transformation is that it preserves the 1-form $\omega$, up to a change of sign:

Theorem 2.2. $\mathcal{L}^* \omega = -\omega$.

Proof. Using notations (2.1), we get:

$\mathcal{L}^* \omega = dz' - p'\,dx'$
$\quad = (x\,dp + p\,dx - dz) - x\,dp$
$\quad = p\,dx - dz$
$\quad = -\omega$. ■
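As a quick numerical sanity check of Proposition 2.1 and Theorem 2.2, the following Python sketch applies $\mathcal{L}$ twice to a random point, and evaluates $\omega$ on a random tangent vector before and after pushing it forward by the differential of $\mathcal{L}$. The helper names (`legendre`, `omega`, `pushforward`) and the choice $n = 3$ are illustrative assumptions, not notation from the text.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def legendre(x, p, z):
    """Legendre transformation (2.1): x' = p, p' = x, z' = p.x - z."""
    return list(p), list(x), dot(p, x) - z


def omega(point, vec):
    """Evaluate w = dz - sum_i p_i dx_i at `point` on the tangent vector `vec`."""
    _, p, _ = point
    dx, _, dz = vec
    return dz - dot(p, dx)


def pushforward(point, vec):
    """Differential of the Legendre transformation at `point`, applied to `vec`."""
    x, p, _ = point
    dx, dp, dz = vec
    return list(dp), list(dx), dot(dp, x) + dot(p, dx) - dz


def rand_vec(n):
    return [random.uniform(-1.0, 1.0) for _ in range(n)]


random.seed(0)
n = 3  # hypothetical dimension, chosen only for the check
point = (rand_vec(n), rand_vec(n), random.uniform(-1.0, 1.0))
vec = (rand_vec(n), rand_vec(n), random.uniform(-1.0, 1.0))

# Proposition 2.1: applying the transformation twice gives back the point.
twice = legendre(*legendre(*point))
involution_error = max(
    max(abs(a - b) for a, b in zip(twice[0], point[0])),
    max(abs(a - b) for a, b in zip(twice[1], point[1])),
    abs(twice[2] - point[2]),
)

# Theorem 2.2: the pulled-back form equals -w on every tangent vector.
pulled_back = omega(legendre(*point), pushforward(point, vec))
pullback_error = abs(pulled_back + omega(point, vec))

print(involution_error, pullback_error)
```

Both printed quantities should be zero up to floating-point rounding; the check is of course no substitute for the two-line proofs above.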

Corollary 2.3. If $V$ is a Lagrangian submanifold of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$, then so is $\mathcal{L}V$.
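Proof. It follows from Proposition 2.1 that $\mathcal{L}$ is a diffeomorphism of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ onto itself. Hence $\mathcal{L}V$ is a closed submanifold whenever $V$ is. There only remains to check that $j^* \omega = 0$. To do that, we write the following diagram:

(2.3)  $\begin{array}{ccc} V & \xrightarrow{\ i\ } & \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \\ \downarrow \ell & & \downarrow \mathcal{L} \\ \mathcal{L}V & \xrightarrow{\ j\ } & \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \end{array}$

where $\ell$ is the restriction of $\mathcal{L}$ to $V$ and $j$ is the canonical injection. This diagram commutes, and gives rise to another commutative diagram relating 1-forms:

(2.4)  $\begin{array}{ccc} T^*V & \xleftarrow{\ i^*\ } & T^*(\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}) \\ \uparrow \ell^* & & \uparrow \mathcal{L}^* \\ T^*(\mathcal{L}V) & \xleftarrow{\ j^*\ } & T^*(\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}) \end{array}$

Taking $\omega$ in the lower right-hand corner, and using formula (1.3) and Theorem 2.2, we get:

(2.5)  $i^* \circ \mathcal{L}^*(\omega) = i^*(-\omega) = -i^*(\omega) = 0$;

going the other way around the diagram, we get

(2.6)  $0 = \ell^* \circ j^*(\omega)$.

As $\ell$ is a diffeomorphism, $\ell^*$ is an isomorphism, and Equation (2.6) implies that $j^* \omega = 0$, i.e. $\mathcal{L}V$ is Lagrangian. ■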

We now introduce a slight misuse of notations. Let $V$ and $W$ be Lagrangian submanifolds of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$, with $W = \mathcal{L}V$, and let $F_V$ and $F_W$ be the associated characteristic maps. We shall write freely $F_W = \mathcal{L}F_V$ and call $F_W$ the Legendre transform of $F_V$. For instance, if $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^\infty$ function, then $\mathcal{L}f$ is the multivalued map from $\mathbb{R}^n$ to $\mathbb{R}$ defined by:

(2.7)  $\mathcal{L}f(x') = \{z' \mid \exists p' \in \mathbb{R}^n : (x', p', z') \in \mathcal{L}V_f\}$.

Using (1.1) and (2.1), we get:

(2.8)  $\mathcal{L}f(p) = \{px - f(x) \mid f'(x) = p\}$.
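Formula (2.8) can be evaluated numerically in dimension one. The sketch below is an illustration only: the function name `legendre_values`, the search window $[-5, 5]$, the grid scan and the test function $f(x) = x^4 - x^2$ are all assumptions made for the example, and the scan misses any root of $f'(x) = p$ at which $f' - p$ does not change sign.

```python
def legendre_values(f, df, p, lo=-5.0, hi=5.0, steps=2000, merge_tol=1e-9):
    """Approximate the set {p*x - f(x) : df(x) = p, lo <= x <= hi} of (2.8).

    Roots of df(x) - p are bracketed by sign changes on a uniform grid and
    refined by bisection. Roots outside [lo, hi], or roots without a sign
    change, are not detected by this simple scan.
    """
    def g(x):
        return df(x) - p

    h = (hi - lo) / steps
    roots = []
    for k in range(steps):
        a, b = lo + k * h, lo + (k + 1) * h
        ga, gb = g(a), g(b)
        if ga == 0.0:
            roots.append(a)
        elif ga * gb < 0.0:
            for _ in range(100):  # plain bisection on the bracketing interval
                m = 0.5 * (a + b)
                gm = g(m)
                if ga * gm <= 0.0:
                    b = m
                else:
                    a, ga = m, gm
            roots.append(0.5 * (a + b))
    if g(hi) == 0.0:
        roots.append(hi)

    # Distinct points x may give the same value z (a common tangent), so merge.
    values = []
    for x in roots:
        v = p * x - f(x)
        if all(abs(v - w) > merge_tol for w in values):
            values.append(v)
    return sorted(values)


def f(x):
    return x**4 - x**2


def df(x):
    return 4 * x**3 - 2 * x


for p in (0.0, 0.3, 1.0):
    print(p, legendre_values(f, df, p))
```

For this nonconvex $f$, the equation $f'(x) = 0$ has the three roots $0$ and $\pm 1/\sqrt{2}$, but the two nonzero roots share the same tangent line, so the set above $p = 0$ has two elements, close to $0$ and $1/4$. For $p = 0.3$ three distinct values appear, and for $p = 1$ only one.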

Several remarks are now in order. First of all, if $f$, in addition to being smooth, is convex, then the function $x \mapsto px - f(x)$ is concave, and the equation $p = f'(x)$ simply means that this function attains its maximum at $x$. Equation (2.8) then becomes:

(2.9)  $\mathcal{L}f(p) = \max\{px - f(x) \mid x \in \mathbb{R}^n\}$.

Formula (2.9) shows that $\mathcal{L}f$ is single- or possibly empty-valued. In other words, $\mathcal{L}f$ is a real-valued function defined on some subset of $\mathbb{R}^n$. It is to be compared with the classical Fenchel transform of convex analysis:

(2.10)  $f^*(p) = \sup\{px - f(x) \mid x \in \mathbb{R}^n\}$.

Formulas (2.9) and (2.10) coincide whenever the function $x \mapsto px - f(x)$ attains its maximum over $\mathbb{R}^n$. Define the effective domain of $f^*$ as the set of points where it is finite:

(2.11)  $\mathrm{dom}\, f^* = \{p \mid f^*(p) < +\infty\}$.
Proposition 2.3. $\mathcal{L}f(p) = f^*(p)$ if and only if $f^*$ is subdifferentiable at $p$, i.e. $\partial f^*(p) \neq \emptyset$. This is the case at every interior point $p$ of $\mathrm{dom}\, f^*$:

(2.12)  $p \in \mathrm{int}\,\mathrm{dom}\, f^* \ \Rightarrow\ \mathcal{L}f(p) = f^*(p)$.

Proof. Let us write down the definition of the subdifferential of $f^*$:

(2.13)  $\partial f^*(p) = \{x \in \mathbb{R}^n \mid px - f^{**}(x) = \max_x\}$

where the notation $\max_x$ means that the left-hand side attains its maximum at $x$. But, as $f$ is continuous and convex, it coincides with its biconjugate $f^{**}$, hence

(2.14)  $\partial f^*(p) = \{x \in \mathbb{R}^n \mid px - f(x) = \max_x\}$

which proves the first part of the proposition.

It is a well-known fact from convex analysis that any lower semicontinuous convex function on a Banach space is continuous, and hence subdifferentiable, on the interior of its effective domain. Hence (2.12). ■
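A worked one-dimensional example, not taken from the text, may help to see why interior points of $\mathrm{dom}\, f^*$ are singled out. Take the smooth convex function $f(x) = e^x$:

```latex
\begin{aligned}
f'(x) = e^{x} = p \quad &\Longleftrightarrow\quad p > 0 \ \text{and}\ x = \ln p, \\
\mathcal{L}f(p) &=
  \begin{cases}
    \{\, p\ln p - p \,\} & p > 0, \\
    \emptyset            & p \le 0,
  \end{cases} \\
f^{*}(p) = \sup_{x}\,\bigl(px - e^{x}\bigr) &=
  \begin{cases}
    p\ln p - p & p > 0, \\
    0          & p = 0, \\
    +\infty    & p < 0.
  \end{cases}
\end{aligned}
```

Here $\mathrm{dom}\, f^* = [0, +\infty)$. At every interior point $p > 0$ the two transforms agree, as in (2.12). At the boundary point $p = 0$ the supremum is not attained, so $\mathcal{L}f(0)$ is empty although $f^*(0) = 0$; correspondingly $\partial f^*(0) = \emptyset$, because the slope $\ln p$ of $f^*$ tends to $-\infty$ as $p \to 0^+$.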

In the general (smooth, nonconvex) case, formula (2.8) sets $\mathcal{L}f(p)$ in one-to-one correspondence with the set of tangents to $f$ having slope $p$:

Proposition 2.4. $z' \in \mathcal{L}f(p)$ if and only if $z = px - z'$ is a tangent hyperplane to graph $f$ in $\mathbb{R}^n \times \mathbb{R}$.

Proof. The hyperplane $z = px - z'$ in $(x, z)$-space is tangent to graph $f$ if and only if there exists $x \in \mathbb{R}^n$ such that $f'(x) = p$ and $f(x) = px - z'$. This reduces to $z' \in \mathcal{L}f(p)$ by Equation (2.8). ■


From Proposition 2.4 one sees instantly that $\mathcal{L}f$ can be multivalued. Indeed $\mathcal{L}f$ is a function, i.e. $\mathcal{L}f(p)$ is empty or a singleton for every $p$, if and only if $f$ has only zero or one tangent of prescribed slope. In dimension $n = 1$, this essentially means that $f$ is strictly convex (or strictly concave). In higher dimensions, this also happens in the nonconvex case: take for instance $f(x_1, x_2) = x_1^2 - x_2^2$, then $f' : (x_1, x_2) \mapsto (2x_1, -2x_2)$ is one-to-one. But the fact remains that, in contrast with the convex case, in the general case we have to deal with multivalued Legendre transforms.
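For the saddle $f(x_1, x_2) = x_1^2 - x_2^2$ just mentioned, formula (2.8) can be worked out by hand. The short computation below is added as an illustration and is not part of the original argument:

```latex
\begin{aligned}
f'(x) = (2x_1,\,-2x_2) = (p_1, p_2) \quad &\Longleftrightarrow\quad
  x_1 = \tfrac{p_1}{2},\qquad x_2 = -\tfrac{p_2}{2}, \\
\mathcal{L}f(p) = \{\, p\cdot x - f(x) \,\}
  &= \Bigl\{ \tfrac{p_1^2}{2} - \tfrac{p_2^2}{2}
      - \Bigl( \tfrac{p_1^2}{4} - \tfrac{p_2^2}{4} \Bigr) \Bigr\}
   = \Bigl\{ \tfrac{p_1^2}{4} - \tfrac{p_2^2}{4} \Bigr\}.
\end{aligned}
```

So the Legendre transform of this nonconvex function is again a single-valued saddle, defined on all of $\mathbb{R}^2$.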
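So let us attempt a description of $\mathcal{L}f$. We denote by $V_f$ the Lagrangian submanifold (1.1) of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ associated with $f$, and by $A(x)$ the matrix of second derivatives of $f$ at $x$:

(2.15)  $A(x) = \left( \frac{\partial^2 f}{\partial x_i\, \partial x_j}(x) \right)_{1 \le i, j \le n}$.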


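Proposition 2.5. Assume $A(\bar{x})$ has full rank $n$. Then there exists a neighborhood $\mathcal{V}$ of $(f'(\bar{x}), \bar{x}, \bar{x} f'(\bar{x}) - f(\bar{x}))$ in $\mathcal{L}V_f$ projecting onto a neighborhood $\mathcal{U}$ of $\bar{p} = f'(\bar{x})$ in $\mathbb{R}^n$, and a local inverse $\varphi$ for $f'$ such that:

(2.16)  $\mathcal{V} = \{(p, [\mathcal{L}_{\mathcal{V}} f]'(p), [\mathcal{L}_{\mathcal{V}} f](p)) \mid p \in \mathcal{U}\}$

with $[\mathcal{L}_{\mathcal{V}} f](p) = p\,\varphi(p) - f \circ \varphi(p)$. In particular, we have:

(2.17)  $[\mathcal{L}_{\mathcal{V}} f]'(\bar{p}) = \bar{x}$.

Proof. It follows from the implicit function theorem that the map $x \mapsto f'(x)$ has a local inverse $\varphi$ defined on some neighborhood $\mathcal{U}$ of $\bar{p}$. Setting:

(2.18)  $\mathcal{V} = \{(f'(x), x, x f'(x) - f(x)) \mid x \in \varphi(\mathcal{U})\}$

and using the definition of $\varphi$, we get:

(2.19)  $\mathcal{V} = \{(p, \varphi(p), p\,\varphi(p) - f \circ \varphi(p)) \mid p \in \mathcal{U}\}$.

Computing the derivative of $\mathcal{L}_{\mathcal{V}} f$, we get:

(2.20)  $[\mathcal{L}_{\mathcal{V}} f]'(p) = \varphi(p) + {}^t\varphi'(p)\,p - {}^t\varphi'(p)\, f' \circ \varphi(p)$
$\quad = \varphi(p) + {}^t\varphi'(p)\,p - {}^t\varphi'(p)\,p$
$\quad = \varphi(p)$

and formula (2.19) reduces to (2.16). ■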
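The identity $[\mathcal{L}_{\mathcal{V}} f]'(p) = \varphi(p)$ of (2.20) is easy to test numerically in dimension one. The sketch below is an illustration under stated assumptions: the test function $f(x) = x^4/4 + x^2/2$ (for which $f'' > 0$ everywhere), the bisection-based inverse `phi` and the central finite difference are choices made here, not constructions from the text.

```python
def f(x):
    return 0.25 * x**4 + 0.5 * x**2


def df(x):
    # f''(x) = 3x^2 + 1 > 0, so A(x) has full rank at every point.
    return x**3 + x


def phi(p):
    """Inverse of df, computed by bisection (df is strictly increasing)."""
    lo, hi = -abs(p) - 1.0, abs(p) + 1.0  # df(lo) < p < df(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if df(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def branch(p):
    """The smooth branch p*phi(p) - f(phi(p)) of the Legendre transform."""
    x = phi(p)
    return p * x - f(x)


def branch_derivative(p, h=1e-5):
    """Central finite-difference approximation of the derivative of `branch`."""
    return (branch(p + h) - branch(p - h)) / (2.0 * h)


for p in (-3.0, -0.5, 0.0, 2.0):
    print(p, phi(p), branch_derivative(p))
```

In each printed row the last two columns should agree to within the finite-difference error, which is what (2.20) predicts.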

$\mathcal{L}_{\mathcal{V}} f$ is a smooth branch of $\mathcal{L}f$ lying above $\bar{p}$. Note that $\bar{p}$ is a regular value for $f' : \mathbb{R}^n \to \mathbb{R}^n$ if and only if it is a regular value for $\pi_x : \mathcal{L}V_f \to \mathbb{R}^n$. This is almost always the case, by Sard's theorem, and the part of $\mathcal{L}f$ lying above $\bar{p}$ then is a countable union of smooth branches such as $\mathcal{L}_{\mathcal{V}} f$ (this is a particular case of Proposition 1.2). If moreover $f'$ is proper at $\bar{p}$, then so is $\pi_x$, and there is only a finite number of branches of $\mathcal{L}f$ lying above $\bar{p}$ (this is a particular case of Corollary 1.3).

We can of course apply Propositions 1.4 and 1.5 to get a description of $\mathcal{L}f$ above critical values of $f'$. But, in this particular case, we prefer another approach, which has the advantage of directly relating the shape of the Legendre transform above $f'(\bar{x})$ to the degeneracy of the matrix of second derivatives at $\bar{x}$. We write the Taylor expansion of $f$ at $\bar{x}$:

(2.21)  $f(\bar{x} + \xi) = f(\bar{x}) + \bar{p}\,\xi + \frac{1}{2} \langle A(\bar{x})\xi, \xi \rangle + \frac{1}{6} P_3(\bar{x}; \xi_1, \ldots, \xi_n) + O(|\xi|^4)$

where $P_3(\bar{x}; \cdot)$ is a homogeneous polynomial of degree 3 in $n$ variables.
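Using the Euler formula, we may write:

$P_3(\bar{x}; \xi_1, \ldots, \xi_n) = \frac{1}{3} \sum_{i=1}^{n} \xi_i \frac{\partial P_3}{\partial \xi_i}(\xi) = \sum_{i=1}^{n} \xi_i \langle B_i(\bar{x})\xi, \xi \rangle$

where $B_i(\bar{x})$ is the matrix with elements $\partial^3 f / \partial x_i\, \partial x_j\, \partial x_k$, $1 \le j, k \le n$. Denote by $\langle B(\bar{x})\xi, \xi \rangle$ the $n$-vector with components $\langle B_i(\bar{x})\xi, \xi \rangle$.

Proposition 2.6. Assume that $A(\bar{x})$ has rank $(n - 1)$ and that:

(2.22)  $\xi \in \mathrm{Ker}\, A(\bar{x}),\ \xi \neq 0 \ \Rightarrow\ P_3(\bar{x}; \xi_1, \ldots, \xi_n) \neq 0$ and $\langle B(\bar{x})\xi, \xi \rangle \notin \mathrm{Im}\, A(\bar{x})$.

Then (possibly after reordering the linear coordinates $(p_1, \ldots, p_n)$ in $\mathbb{R}^n$ and changing $p_n$ to $-p_n$) there is a neighborhood $\mathcal{V}$ of $(f'(\bar{x}), \bar{x}, f'(\bar{x})\bar{x} - f(\bar{x}))$ in $\mathcal{L}V_f$, a neighborhood $\mathcal{U} = \mathcal{U}' \times \mathcal{U}_n$ of $(\bar{p}_1, \ldots, \bar{p}_{n-1}, \bar{p}_n)$ in $\mathbb{R}^n$, and $C^\infty$ functions $k_1, k_2 : \mathcal{U}' \to \mathbb{R}$ and $h : \mathcal{U} \to \mathbb{R}$, such that $\pi_{xz} \mathcal{V}$ is completely described by the set of conditions:

(2.23)  $(p_1, \ldots, p_{n-1}, p_n) \in \mathcal{U}' \times \mathcal{U}_n$ and $p_n \ge k_1(p_1, \ldots, p_{n-1})$

(2.24)  $z \in \{z_+(p), z_-(p)\}$, with
$z_+(p) = k_2(p_1, \ldots, p_{n-1}) + (p_n - k_1)\, h(p_1, \ldots, p_{n-1}, +\sqrt{p_n - k_1})$
$z_-(p) = k_2(p_1, \ldots, p_{n-1}) + (p_n - k_1)\, h(p_1, \ldots, p_{n-1}, -\sqrt{p_n - k_1})$.

Moreover $\partial z / \partial p_i = x_i$, $1 \le i \le n$, along the hypersurface $p_n = k_1(p_1, \ldots, p_{n-1})$.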


Proof. The $(x_1, \ldots, x_n)$ are a system of coordinates in $\mathcal{L}V_f$, formula (2.8) yielding $(p_1, \ldots, p_n, z)$ in terms of $(x_1, \ldots, x_n)$. In particular:

(2.26)  $\frac{\partial f}{\partial x_i}(x) = p_i$ for $1 \le i \le n$.

The rank assumption on the matrix $A(\bar{x})$ implies that one of its $(n - 1) \times (n - 1)$ minors is invertible, for instance the one defined by the $(n - 1)$ first rows and the $(n - 1)$ first columns. Moreover, the $n$-th row then is a linear combination of the $(n - 1)$ first rows.

It follows from the implicit function theorem that the $(n - 1)$ first equations of system (2.26) can be solved locally for $(x_1, \ldots, x_{n-1})$. In other words, $(p_1, \ldots, p_{n-1}, x_n)$ can be used as coordinates in some neighborhood $\mathcal{V}_1$ of $(\bar{p}, \bar{x}, \bar{z})$ in $\mathcal{L}V_f$ (1). Now consider the path $w(t) = (p(t), x(t), z(t))$ in $\mathcal{V}_1$ such that $p_1(t) = \bar{p}_1, \ldots, p_{n-1}(t) = \bar{p}_{n-1}$, $x_n(t) = \bar{x}_n + t$. There is some $T > 0$ such that $w(t)$ is well-defined for $-T < t < T$. Obviously $w(0) = (\bar{p}, \bar{x}, \bar{p}\bar{x} - f(\bar{x}))$; we shall write $\xi'$ for $\frac{dx}{dt}(0)$ and $\xi''$ for $\frac{d^2x}{dt^2}(0)$. Equations (2.26) are satisfied along $w(t)$:

(2.27)  $p_i(t) = \frac{\partial f}{\partial x_i}(x_1(t), \ldots, x_n(t))$ for $-T < t < T$.

(1) From now on we set $\bar{p} = f'(\bar{x})$ and $\bar{z} = \bar{p}\bar{x} - f(\bar{x})$.

Writing Taylor expansions into (2.27), we get:

(2.28)  $p(t) - \bar{p} = t A(\bar{x})\xi' + \frac{t^2}{2} \left[ \langle B(\bar{x})\xi', \xi' \rangle + A(\bar{x})\xi'' \right] + O(t^3)$.

But $p_i(t) - \bar{p}_i = 0$ for $1 \le i \le n - 1$, so that both sides of the $(n - 1)$ first equations of system (2.28) are identically zero on $(-T, T)$. It follows that the $(n - 1)$ first components of $A(\bar{x})\xi'$ are zero, and, by the rank assumption, so is the last one:

(2.29)  $A(\bar{x})\xi' = 0$.

Assumption (2.22) then yields:

(2.30)  $\langle B(\bar{x})\xi', \xi' \rangle + A(\bar{x})\xi'' \neq 0$.

But again, both sides of the $(n - 1)$ first equations (2.28) being identically zero on $(-T, T)$, the $(n - 1)$ first components of vector (2.30) must be zero. It follows that the $n$-th component must be nonzero.

We summarize our results so far by stating that the $n$-th equation of system (2.28) can be written as:

(2.31)  $p_n(t) - \bar{p}_n = \frac{1}{2} a_n t^2 + O(t^3), \quad a_n \neq 0$.


Similarly, we compute the Taylor expansion of $z(t)$ at $t = 0$. By definition, we have:

(2.32)  $z(t) = f'[x(t)]\,x(t) - f[x(t)]$.

Successive derivations yield:

(2.33)  $\frac{dz}{dt}(0) = \langle A(\bar{x})\xi', \xi' \rangle$

(2.34)  $\frac{d^2z}{dt^2}(0) = 2 \langle A(\bar{x})\xi', \xi'' \rangle + P_3(\bar{x}; \xi'_1, \ldots, \xi'_n)$.

But we have seen that $A(\bar{x})\xi' = 0$, so that $\frac{dz}{dt}(0) = 0$ and $\frac{d^2z}{dt^2}(0) \neq 0$ by assumption (2.22). Finally we get:

(2.35)  $z(t) - \bar{z} = \frac{1}{2} b_n t^2 + O(t^3), \quad b_n \neq 0$.


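Now $w'(0)$ is just the tangent vector $\frac{\partial}{\partial x_n}(\bar{p}_1, \ldots, \bar{p}_{n-1}, \bar{x}_n)$ associated with the new coordinate system. In other words, $p_n$ and $z$, considered as functions of $(p_1, \ldots, p_{n-1}, x_n)$ in $\mathcal{V}_1$, satisfy:

(2.36)  $\frac{\partial p_n}{\partial x_n}(\bar{p}_1, \ldots, \bar{p}_{n-1}, \bar{x}_n) = 0$

(2.37)  $\frac{\partial^2 p_n}{\partial x_n^2}(\bar{p}_1, \ldots, \bar{p}_{n-1}, \bar{x}_n) \neq 0$

(2.38)  $\frac{\partial z}{\partial x_n}(\bar{p}_1, \ldots, \bar{p}_{n-1}, \bar{x}_n) = 0$

(2.39)  $\frac{\partial^2 z}{\partial x_n^2}(\bar{p}_1, \ldots, \bar{p}_{n-1}, \bar{x}_n) \neq 0$.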

But other points $(p, x, z)$ in $\mathcal{V}_1$ enjoy the property that $A(x)$ is of rank $(n - 1)$ and satisfies (2.22). Indeed, consider the Jacobian determinant (1):

(2.40)  $\Delta(p_1, \ldots, p_{n-1}, x_n) = \frac{D(p_1, \ldots, p_{n-1}, p_n)}{D(p_1, \ldots, p_{n-1}, x_n)} = \frac{\partial p_n}{\partial x_n}(p_1, \ldots, p_{n-1}, x_n)$

by a simple computation. Clearly rank $A(x_1, \ldots, x_n) < n$ if and only if $\Delta(p_1, \ldots, p_{n-1}, x_n) = 0$. But $\Delta = 0$ and $\frac{\partial \Delta}{\partial x_n} = \frac{\partial^2 p_n}{\partial x_n^2} \neq 0$ at point $(\bar{p}_1, \ldots, \bar{p}_{n-1}, \bar{x}_n)$. By the implicit function theorem, there are neighborhoods $\mathcal{U}_1$ of $(\bar{p}_1, \ldots, \bar{p}_{n-1})$ and $\mathcal{X}_1$ of $\bar{x}_n$ and a $C^\infty$ map $g : \mathcal{U}_1 \to \mathcal{X}_1$ such that

(2.41)  $\Delta(p_1, \ldots, p_{n-1}, x_n) = 0 \iff x_n = g(p_1, \ldots, p_{n-1}) \quad \forall (p_1, \ldots, p_{n-1}, x_n) \in \mathcal{U}_1 \times \mathcal{X}_1$.

(1) Recall that $\frac{D(p_1, \ldots, p_n)}{D(x_1, \ldots, x_n)}$ denotes the determinant of the matrix $\left( \frac{\partial p_i}{\partial x_j} \right)$.

Conversely, $x_n = g(p_1, \ldots, p_{n-1})$ implies rank $A(x_1, \ldots, x_n) < n$. By a continuity argument, we can shrink $\mathcal{U}_1$ and $\mathcal{X}_1$ to $\mathcal{U}_2$ and $\mathcal{X}_2$ so that rank $A(x_1, \ldots, x_n)$ is exactly $n - 1$ and assumption (2.22) is satisfied whenever $x_n = g(p_1, \ldots, p_{n-1})$ in $\mathcal{U}_2 \times \mathcal{X}_2$. We may even include in the bargain the fact that the first minor of $A(x)$ is invertible, so that $(p_1, \ldots, p_{n-1}, x_n)$ enjoys all the properties of $(\bar{p}_1, \ldots, \bar{p}_{n-1}, \bar{x}_n)$. By (2.36) to (2.39), it follows that $\partial p_n / \partial x_n = 0$, $\partial^2 p_n / \partial x_n^2 \neq 0$, $\partial z / \partial x_n = 0$, $\partial^2 z / \partial x_n^2 \neq 0$ at every point $(p_1, \ldots, p_{n-1}, x_n) \in \mathcal{U}_2 \times \mathcal{X}_2$ such that $x_n = g(p_1, \ldots, p_{n-1})$.

It follows that:

(2.42)  $p_n = k_1(p_1, \ldots, p_{n-1}) + [x_n - g(p_1, \ldots, p_{n-1})]^2\, h_1(p_1, \ldots, p_{n-1}, x_n)$

(2.43)  $z = k_2(p_1, \ldots, p_{n-1}) + [x_n - g(p_1, \ldots, p_{n-1})]^2\, h_2(p_1, \ldots, p_{n-1}, x_n)$

with

(2.44)  $x_n = g(p_1, \ldots, p_{n-1}) \ \Rightarrow\ h_1(p_1, \ldots, p_{n-1}, x_n)\, h_2(p_1, \ldots, p_{n-1}, x_n) \neq 0$.

The point of $\mathcal{V}$ defined by $(p_1, \ldots, p_{n-1}, x_n = g(p_1, \ldots, p_{n-1}))$ yields $p_n = k_1(p_1, \ldots, p_{n-1})$ and $z = k_2(p_1, \ldots, p_{n-1})$, so that $k_1$ and $k_2$ are $C^\infty$ functions. It follows from the $C^\infty$ division theorem of Malgrange that $h_1$ and $h_2$ can be chosen to be $C^\infty$ functions also.
Assume that $h_1(p_1, \ldots, p_{n-1}, x_n) > 0$. Then we can define $y_n = [x_n - g]\sqrt{h_1}$ and use $(p_1, \ldots, p_{n-1}, y_n)$ as a new system of local coordinates in some smaller neighborhood $\mathcal{V}_3$ of $(\bar{p}, \bar{x}, \bar{z})$ corresponding to $(p_1, \ldots, p_{n-1}, y_n) \in \mathcal{U}_3 \times \mathcal{Y}_3$. Equations (2.42) and (2.43) become

(2.45)  $p_n - k_1(p_1, \ldots, p_{n-1}) = y_n^2$

(2.46)  $z - k_2(p_1, \ldots, p_{n-1}) = y_n^2\, h_3(p_1, \ldots, p_{n-1}, y_n)$

with $(p_1, \ldots, p_{n-1}) \in \mathcal{U}_3$ and $y_n \in \mathcal{Y}_3$. This implies that $p_n - k_1$ is nonnegative. Conversely, whenever $p_n \ge k_1$, we can solve (2.45) by $y_n = \pm\sqrt{p_n - k_1}$, getting two distinct values whenever the inequality is strict; possibly shrinking $\mathcal{U}_3$ to $\mathcal{U}_4$, we can arrange that both those values are in $\mathcal{Y}_3$, so that Equation (2.46) becomes:

(2.47)  $z - k_2 = (p_n - k_1)\, h_3(p_1, \ldots, p_{n-1}, \pm\sqrt{p_n - k_1})$

which, together with $(p_1, \ldots, p_{n-1}) \in \mathcal{U}_4$, completely describes $\pi_{xz} \mathcal{V}_4$.

If $h_1(p_1, \ldots, p_{n-1}, x_n)$ should be negative, then we simply reverse $p_n$ to $-p_n$, and we are back to the preceding case. So formulae (2.23) and (2.24) are proved.


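For convenience, denote by $Q$ the set of points $(p_1, \ldots, p_n)$ such that $p_n \ge k_1$, and by $\Sigma$ its boundary, the equation of which is $p_n = k_1$. Formula (2.24) yields along $\Sigma$:

(2.48)  $\frac{\partial z_+}{\partial p_i} = \frac{\partial z_-}{\partial p_i} = \frac{\partial k_2}{\partial p_i} - h\, \frac{\partial k_1}{\partial p_i}$ for $1 \le i \le n - 1$, and $\frac{\partial z_+}{\partial p_n} = \frac{\partial z_-}{\partial p_n} = h$.

It follows also from formula (2.24) that with any $p \in \Sigma$ and any vector $\pi' = (\pi'_1, \ldots, \pi'_n)$ pointing to the interior of $Q$ (i.e. $\pi'_n - \sum_{i=1}^{n-1} \frac{\partial k_1}{\partial p_i} \pi'_i > 0$) we can associate two continuous paths $t \mapsto (p(t), x(t), z_+(t))$ and $t \mapsto (p(t), x(t), z_-(t))$ in $\mathcal{L}V_f$ starting at $(p, x, z)$ and satisfying $\frac{dp}{dt}(t) \to \pi'$ as $t \to 0$. From Proposition 1.4 (taking care that $x$- and $p$-coordinates are interchanged) it follows that, when $t \to 0$:

(2.49)  $\frac{dz_+}{dt}(t) \to x \cdot \pi'$ and $\frac{dz_-}{dt}(t) \to x \cdot \pi'$.

But from formula (2.48) we get directly

(2.50)  $\frac{dz_+}{dt}(t) \to \frac{\partial z}{\partial p} \cdot \pi'$ and $\frac{dz_-}{dt}(t) \to \frac{\partial z}{\partial p} \cdot \pi'$

where $\frac{\partial z}{\partial p}$ denotes the common value of the $n$-vectors (2.48). This yields $\frac{\partial z}{\partial p} \cdot \pi' = x \cdot \pi'$ for every vector $\pi'$ in some half-space, and hence the desired formula $x = \partial z / \partial p$. ■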

In other words, $\mathcal{L}f$ is not defined locally for $p_n < k_1(p_1, \ldots, p_{n-1})$. In the region $p_n \ge k_1(p_1, \ldots, p_{n-1})$, there are two well-defined branches for $\mathcal{L}f$. Along the boundary they coincide and have the same tangent hyperplane, and their shape away from the boundary is given by the following result:

Corollary 2.7. We keep the assumptions and notations of Proposition 2.6, and we set $q_n = p_n - k_1(p_1, \ldots, p_{n-1})$. Then $\mathcal{L}f$ can be expanded near the boundary $q_n = 0$ as:

(2.51)  $z = k_2(p_1, \ldots, p_{n-1}) + q_n\, a_0(p_1, \ldots, p_{n-1}) \pm q_n^{3/2}\, a_1(p_1, \ldots, p_{n-1}) + o(q_n^{3/2})$

where the functions $k_2, a_0, a_1$ are $C^\infty$. Moreover:

(2.52)  $\frac{\partial k_2}{\partial p_i}(p_1, \ldots, p_{n-1}) = x_i + x_n \frac{\partial k_1}{\partial p_i}(p_1, \ldots, p_{n-1})$ for $1 \le i \le n - 1$

(2.53)  $a_0(p_1, \ldots, p_{n-1}) = x_n$.


The proof consists simply of replacing $h$ by its Taylor expansion in formula (2.24). We see that the two branches only intersect at the boundary $p_n = k_1$ of the admissible domain $p_n \ge k_1$ (this is true even in the special case where $a_0 = 0$, because then the third order term $\pm a_1 q_n^{3/2}$ takes precedence). This is the classical "cusp" situation, so that Proposition 2.6 can be loosely stated as follows: a simple inflexion point of $f$ gives rise to a simple cusp of $\mathcal{L}f$.
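The simplest instance of this statement can be computed by hand. The example below, $f(x) = x^3/3$ in dimension $n = 1$, is added for illustration and does not appear in the original text:

```latex
\begin{aligned}
f(x) = \tfrac{1}{3}x^{3}: \qquad
  & f'(x) = x^{2}, \quad A(x) = f''(x) = 2x, \quad f'''(x) = 2, \\
\mathcal{L}f(p) &= \{\, px - \tfrac{1}{3}x^{3} \mid x^{2} = p \,\}
  = \begin{cases}
      \emptyset & p < 0, \\
      \{\, +\tfrac{2}{3}p^{3/2},\ -\tfrac{2}{3}p^{3/2} \,\} & p \ge 0.
    \end{cases}
\end{aligned}
```

At $\bar{x} = 0$ the matrix $A(0) = 0$ has rank $n - 1 = 0$, and for $\xi \neq 0$ one has $P_3(0; \xi) = 2\xi^3 \neq 0$ and $\langle B(0)\xi, \xi \rangle = 2\xi^2 \notin \mathrm{Im}\, A(0) = \{0\}$, so the assumptions of Proposition 2.6 hold. In the notation of Corollary 2.7 this gives $k_1 = k_2 = 0$, $q_n = p$, $a_0 = \bar{x} = 0$ and $a_1 = 2/3$: it is precisely the special case $a_0 = 0$ in which the term $\pm a_1 q_n^{3/2}$ alone separates the two branches of the cusp.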

Of course, more degenerate inflexion points of $f$ give rise to more complicated situations in $\mathcal{L}f$. A classification can be attempted along the lines of Proposition 2.6, but we are not going to conduct it any further. Let us only point out that, for all functions $f \in \mathcal{F}$, where $\mathcal{F}$ is a dense $G_\delta$ subset of $C^\infty(\mathbb{R}^n)$ in the Whitney topology, the space $\mathbb{R}^n$ can be partitioned as $\Sigma_0 \cup \Sigma_1 \cup \Sigma_2$ where:

$\Sigma_0$ consists of all points $x$ where $A(x)$ is nondegenerate; it is an open subset of $\mathbb{R}^n$.

$\Sigma_1$ consists of all points $x$ where $A(x)$ has rank $(n - 1)$ and satisfies (2.22); it is a codimension one submanifold.

$\Sigma_2$ consists of all other points; it is a stratified subset of codimension $\ge 2$.

Without going into details, this follows from Thom's transversality theorems. So, for most functions, the analysis performed thus far describes everything up to codimension two. In the one-dimensional case, $n = 1$, that means precisely everything. Let us conclude by a simple example.

Define a function f on the real line by:

(2.54)  $f(x) = (x + x^2)^2$ .

We want to know what $\mathcal{L}f$ looks like. We need some data on f, which are summarized in the following:

$f'(x) = 4x(x + 1)(x + \tfrac{1}{2}) = 4x^3 + 6x^2 + 2x$
$f''(x) = 12x^2 + 12x + 2$

    x          f(x)     f'(x)       f''(x)    z = f'(x)x - f(x)
    -∞         +∞       -∞          *         +∞
    -1         0        0           *         0
    -0.7887    1/36     0.19245     0         -0.1796
    -1/2       1/16     0           *         -1/16
    -0.2113    1/36     -0.19245    0         0.0129
    0          0        0           *         0
    +∞         +∞       +∞          *         +∞


We can now draw the graphs of f and $\mathcal{L}f$ (Figures 1 and 2). Note that the z-axis p = 0 intersects $\mathcal{L}f$ at the simple point z = -1/16 and the double point z = 0. This means that there are two distinct tangents to f with slope p = 0: the first one is tangent to f at x = -1/2 only, the second one is tangent to f both at x = -1 and x = 0. From formula (2.17), the tangent to $\mathcal{L}f$ at (p = 0, z = -1/16) has slope -1/2, and the two branches of $\mathcal{L}f$ which intersect at (p = 0, z = 0) have distinct tangents of slopes -1 and 0 respectively.

Moreover $\mathcal{L}f$ features two cusps at (0.19245, -0.1796) and (-0.19245, 0.0129). By Proposition 2.6, the tangents at those cusps are well-defined, and have slopes -0.7887 and -0.2113 respectively.

Note the parametric equations for $\mathcal{L}f$:

(2.55)  $p = 2x(x + 1)(2x + 1)$,  $z = x(x + 1)(3x^2 + x)$

so that the graph of $\mathcal{L}f$ is the semi-algebraic set obtained by writing that the two algebraic equations (2.55) have a common solution in x, i.e. by eliminating x between the two equations.
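The numerical data of this example can be re-derived from the parametrization by x. The following is a minimal sketch, assuming only the definition $z = f'(x)x - f(x)$ used in the table; the function names `slope`, `height` and `legendre_point` are illustrative and do not come from the text.

```python
import math

def f(x):
    return (x + x * x) ** 2

def slope(x):
    # p = f'(x) = 2x(x + 1)(2x + 1)
    return 2 * x * (x + 1) * (2 * x + 1)

def height(x):
    # z = p*x - f(x), which factors as x^2 (x + 1)(3x + 1)
    return slope(x) * x - f(x)

def legendre_point(x):
    """Point (p, z) of the graph of Lf with parameter x."""
    return slope(x), height(x)

# Cusps sit where f''(x) = 12x^2 + 12x + 2 vanishes.
cusp_params = [(-3 - math.sqrt(3)) / 6, (-3 + math.sqrt(3)) / 6]
cusps = [legendre_point(x) for x in cusp_params]

# The slope p = 0 is reached at x = -1, -1/2 and 0.
zero_slope_points = [legendre_point(x) for x in (-1.0, -0.5, 0.0)]

if __name__ == "__main__":
    for x, (p, z) in zip(cusp_params, cusps):
        print(f"cusp: x = {x:.4f}, p = {p:.5f}, z = {z:.4f}")
    print("points above p = 0:", zero_slope_points)
```

The two cusp parameters are the roots of $f''$, and the three parameters with p = 0 give the simple point z = -1/16 and the double point z = 0.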
§ III. Extremization problems and duality.

Whenever V is a subset of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$, we shall denote by:

(P)  ext V

the problem of determining all couples $(x, z) \in \mathbb{R}^n \times \mathbb{R}$ such that

(3.1)  $(x, 0, z) \in V$ .

(P) will be termed an extremization problem, and any couple (x, z) satisfying (3.1) will be called a solution of (P). The value of (P), denoted by {ext P}, will be the set of all $z \in \mathbb{R}$ such that there is an $x \in \mathbb{R}^n$ with (3.1) satisfied.

An important special case occurs when V is the Lagrangian submanifold associated with some $C^\infty$ function $f : \mathbb{R}^n \to \mathbb{R}$:

(3.2)  $V = \{(x, f'(x), f(x)) \mid x \in \mathbb{R}^n\}$ .

In that case formula (3.1) becomes:

(3.3)  $f'(x) = 0$,  $z = f(x)$

so that (P) is simply the problem of determining the critical points and values of f. We shall write it

(P)  $\operatorname{ext}_x f(x)$

and call it an unconstrained smooth extremization problem.
Another important special case occurs when:

(3.4)  $V = \{(x,\ f'(x) - \sum_{j=1}^{k} \lambda_j g_j'(x),\ f(x)) \mid g_j(x) = 0,\ \lambda_j \in \mathbb{R},\ 1 \le j \le k\}$

where f and the $g_j$, $1 \le j \le k$, are $C^\infty$ functions on $\mathbb{R}^n$. We set:

(3.5)  $S = \pi_x V = \{x \mid g_j(x) = 0,\ 1 \le j \le k\}$ .


Lemma 3.1. If the $g_j'(x)$, $1 \le j \le k$, are linearly independent at every $x \in S$, then S is a closed submanifold of $\mathbb{R}^n$ and V is a Lagrangian submanifold of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$.

Proof. The fact that S and V are closed (n - k)- and n-dimensional submanifolds follows easily from the implicit function theorem. We check condition (1.3) for V:

(3.6)  $\omega|_V = df(x) - (f'(x) - \sum_j \lambda_j g_j'(x))\,dx = (df(x) - f'(x)\,dx) + \sum_j \lambda_j g_j'(x)\,dx$ .

The first term vanishes identically, and along V we have $g_j'(x)\,dx = 0$ since $g_j(x)$ is a constant. ■

The solutions of (P) are all couples (x, f(x)) such that:

(3.7)  $x \in S$ and $\exists \lambda_1, \dots, \lambda_k$ :  $f'(x) - \sum_{j=1}^{k} \lambda_j g_j'(x) = 0$ .

If the $g_j'(x)$, $1 \le j \le k$, are linearly independent at every point $x \in S$, condition (3.7) means that x is a critical point of $f|_S$, the restriction of f to S. For that reason, we shall write (P) as:

(P)  ext f(x)
     $g_j(x) = 0$,  $1 \le j \le k$

and call it a constrained smooth extremization problem.
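Condition (3.7) can be tested numerically on a small case. The sketch below is an illustration under assumptions that are not in the text: it takes n = 2, k = 1, the hypothetical data $f(x) = x_1 x_2$ and $g(x) = x_1^2 + x_2^2 - 1$, and the helper names are made up for the example.

```python
import math

def grad_f(x):
    # f(x1, x2) = x1 * x2
    return (x[1], x[0])

def g(x):
    # single constraint g(x) = x1^2 + x2^2 - 1
    return x[0] ** 2 + x[1] ** 2 - 1.0

def grad_g(x):
    return (2 * x[0], 2 * x[1])

def multiplier(x):
    """Least-squares lambda minimizing |f'(x) - lambda g'(x)|."""
    df, dg = grad_f(x), grad_g(x)
    return (df[0] * dg[0] + df[1] * dg[1]) / (dg[0] ** 2 + dg[1] ** 2)

def is_solution(x, tol=1e-9):
    """Check (3.7): x lies on S and f'(x) - lambda g'(x) = 0 for some lambda."""
    if abs(g(x)) > tol:
        return False
    lam = multiplier(x)
    df, dg = grad_f(x), grad_g(x)
    residual = math.hypot(df[0] - lam * dg[0], df[1] - lam * dg[1])
    return residual < tol

s = 1 / math.sqrt(2)
candidates = [(s, s), (s, -s), (1.0, 0.0)]
flags = [is_solution(x) for x in candidates]
```

With a single constraint the multiplier, when it exists, is the least-squares coefficient, so a vanishing residual is equivalent to (3.7). The critical values at the first two candidates are 1/2 and -1/2.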

Any critical point of a smooth convex (or concave) function is a minimum (or a maximum). For that reason, the various extremization problems we stated reduce to optimization problems when f is convex (or concave) and the $g_j$ linear. So extremization is a natural generalization of optimization to the nonconvex case. Now it is a well-known fact that there is a duality theory for convex optimization problems, and we want to extend it to nonconvex extremization problems.

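From now on we are given a linear map $A : \mathbb{R}^n \to \mathbb{R}^m$. We shall denote by x, p, y, q the vectors of $\mathbb{R}^n$, $(\mathbb{R}^n)^*$, $\mathbb{R}^m$, $(\mathbb{R}^m)^*$ respectively. With any subset V of $\mathbb{R}^{n+m} \times \mathbb{R}^{n+m} \times \mathbb{R}$ we associate the subset $V_A$ of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ defined by:

(3.8)  $V_A = \{(x,\ p + A^*q,\ z) \mid (x, Ax;\ p, q;\ z) \in V\}$ .

Applying this definition to the transpose $A^* : (\mathbb{R}^m)^* \to (\mathbb{R}^n)^*$, and to any subset V of $\mathbb{R}^{n+m} \times \mathbb{R}^{n+m} \times \mathbb{R}$, we get:

(3.9)  $V_{A^*} = \{(q,\ y + Ax,\ z) \mid (A^*q, q;\ x, y;\ z) \in V\} \subset \mathbb{R}^m \times \mathbb{R}^m \times \mathbb{R}$ .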

We now state the main result of this section:

Theorem 3.2. Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map and V any subset of $\mathbb{R}^{n+m} \times \mathbb{R}^{n+m} \times \mathbb{R}$. Consider the extremization problems:

(P)   ext $V_A$

(P*)  ext $(\mathcal{L}V)_{-A^*}$ .

The formulae

(3.10)  $(x, Ax;\ -A^*q, q;\ z) \in V$,  $z' = -z$

(3.11)  $(-A^*q, q;\ x, Ax;\ z') \in \mathcal{L}V$,  $z = -z'$

are equivalent. Whenever (x, z) is a solution of (P), the set of (q, z') satisfying (3.10) or (3.11) is nonempty, and all of them are solutions of (P*). Whenever (q, z') is a solution of (P*), the set of (x, z) satisfying (3.10) or (3.11) is nonempty, and all of them are solutions of (P).

Proof. To say that (x, z) is a solution of (P) means that there exists (p, q) such that:

(3.12)  $(x, Ax;\ p, q;\ z) \in V$,  $p + A^*q = 0$

which we may write in a more symmetric form:

(3.13)  $(x, y;\ p, q;\ z) \in V$,  $y - Ax = 0$,  $p + A^*q = 0$ .

Applying the Legendre transformation, this becomes:

(3.14)  $(p, q;\ x, y;\ px + qy - z) \in \mathcal{L}V$,  $y - Ax = 0$,  $p + A^*q = 0$ .

The last two equations imply that:

(3.15)  $z' = px + qy - z = -A^*q \cdot x + q \cdot Ax - z = -z$

and formula (3.14) becomes:

(3.16)  $(p, q;\ x, y;\ -z) \in \mathcal{L}V$,  $y - Ax = 0$,  $p + A^*q = 0$ .

Breaking the symmetry, we get:

(3.17)  $(-A^*q, q;\ x, y;\ -z) \in \mathcal{L}V$,  $y - Ax = 0$

which means precisely that (q, -z) is a solution of (P*). Since the Legendre transformation is an involution, formulae (3.12) and (3.17) are equivalent, and set up a one-to-one pairing between solutions (x, z) of (P) and (q, -z) of (P*). But (3.12) is just (3.10), and (3.17) is (3.11). ■

The following is an easy consequence of the fact that the Legendre transformation $\mathcal{L}$ and the operation $A \mapsto -A^*$ are involutions:

Corollary 3.3.  (P*)* = (P).

Problems (P) and (P*) will be said to be dual to each other.

Another easy consequence of Theorem 3.2 is the following:

Corollary 3.4.  {ext P} = -{ext P*} .


Theorem 3.2 is more readily understandable in the case of unconstrained smooth extremization problems. It reads:

Proposition 3.5. Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map and $f : \mathbb{R}^{n+m} \to \mathbb{R}$ a $C^\infty$ function. Consider the extremization problems:

(P)   $\operatorname{ext}_x f(x, Ax)$

(P*)  $\operatorname{ext}_q \mathcal{L}f(-A^*q, q)$ .

The formulae:

(3.18)  $-A^*q = f'_x(x, Ax)$,  $q = f'_y(x, Ax)$,  $z' = -f(x, Ax)$

set up a one-to-one pairing between solutions (x, f(x, Ax)) of (P) and (q, z') of (P*). Whenever the matrix of second derivatives $f''$ has rank (n + m) at (x, Ax), there is a neighborhood $\mathcal{V}$ of $(-A^*q, q)$ and

This follows easily from taking $V = V_f$, the Lagrangian submanifold associated with f, in Theorem 3.2. The last part is a consequence of Proposition 2.5. Note that relations analogous to (3.20) hold whenever $(\mathcal{L}f)'$ can be defined in a consistent way at (p, q; z'); this would be the case for the cusp points described in Proposition 2.6.

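Let us give an important special case:

Corollary 3.6. Let $\varphi : \mathbb{R}^n \to \mathbb{R}$ and $\psi : \mathbb{R}^m \to \mathbb{R}$ be $C^\infty$ functions, and consider the extremization problems:

(P)   $\operatorname{ext}_x\ \varphi(x) + \psi(Ax)$

(P*)  $\operatorname{ext}_q\ \mathcal{L}\varphi(-A^*q) + \mathcal{L}\psi(q)$ .

Then {ext P} = -{ext P*}, and there is a one-to-one pairing between solutions $(x, \varphi(x) + \psi(Ax))$ of (P) and (q, z') of (P*), described by the relation:

(3.20)  $-A^*q = \varphi'(x)$,  $q = \psi'(Ax)$,  $z' = -(\varphi(x) + \psi(Ax))$ .

Whenever $\varphi''$ has rank n at x and $\psi''$ has rank m at Ax, there are neighborhoods $\mathcal{U}_1$ and $\mathcal{U}_2$ of $-A^*q$ and q, and selections $\mathcal{L}_{\mathcal{U}_1}\varphi$ and $\mathcal{L}_{\mathcal{U}_2}\psi$ of $\mathcal{L}\varphi$ and $\mathcal{L}\psi$ over $\mathcal{U}_1$ and $\mathcal{U}_2$, such that:

(3.21)  $\mathcal{L}_{\mathcal{U}_1}\varphi(-A^*q) + \mathcal{L}_{\mathcal{U}_2}\psi(q) = -(\varphi(x) + \psi(Ax))$

(3.22)  $x = (\mathcal{L}_{\mathcal{U}_1}\varphi)'(-A^*q)$,  $Ax = (\mathcal{L}_{\mathcal{U}_2}\psi)'(q)$ .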
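The pairing of Corollary 3.6 can be checked by hand in one dimension. The following sketch rests on assumptions chosen only for illustration: n = m = 1, A is multiplication by 2, $\varphi(x) = x^3/3 - x$ (nonconvex) and $\psi(y) = y^2/2$. The Legendre transforms are evaluated along their parametrization, $\mathcal{L}\varphi(\varphi'(x)) = \varphi'(x)x - \varphi(x)$, so no branch has to be selected.

```python
import math

A = 2.0  # a 1x1 "matrix"; its transpose is A itself

def phi(x):
    return x ** 3 / 3 - x

def dphi(x):
    return x ** 2 - 1

def psi(y):
    return y ** 2 / 2

def dpsi(y):
    return y

# Critical points of x -> phi(x) + psi(A x): roots of x^2 + A^2 x - 1 = 0.
disc = math.sqrt(A ** 4 + 4)
critical_points = [(-A ** 2 - disc) / 2, (-A ** 2 + disc) / 2]

def dual_pair(x):
    """Return (q, primal value, dual value) attached to a critical point x."""
    y = A * x
    q = dpsi(y)                       # q = psi'(Ax)
    p = -A * q                        # p = -A* q, should equal phi'(x)
    z_primal = phi(x) + psi(y)
    z_dual = (p * x - phi(x)) + (q * y - psi(y))  # L phi(p) + L psi(q)
    return q, z_primal, z_dual

pairs = [dual_pair(x) for x in critical_points]
```

For each critical point the dual value is the opposite of the primal one, which is the one-dimensional content of {ext P} = -{ext P*}.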

We now give two examples of applications of Theorem 3.2. They are both related to the problem of finding the eigenvectors and eigenvalues of a self-adjoint operator: we write it as an extremization problem in two different ways, and dualize both of them.

Let us start with the constrained smooth extremization problem:

(P)  ext $\|Ax\|^2$
     $\|x\|^2 = 1$ .

A solution to (P) is a couple (x, z) such that:

(3.23)  $\|x\|^2 = 1$,  $\exists \lambda \in \mathbb{R}$ :  $A^*Ax - \lambda x = 0$

(3.24)  $z = \|Ax\|^2 = \lambda$

i.e. x is an eigenvector of $A^*A$ and z is the corresponding eigenvalue. Consider the subset $V \subset \mathbb{R}^{n+m} \times \mathbb{R}^{n+m} \times \mathbb{R}$ defined by:

(3.25)  $V = \{(x, y;\ -2\lambda x, 2y;\ \|y\|^2) \mid \|x\|^2 = 1,\ \lambda \in \mathbb{R}\}$ .

By Lemma 3.1 it is a Lagrangian submanifold. It is clear that problem (P) is simply ext $V_A$. For convenience, we will cut out part of V; indeed, it is apparent from formula (3.24) that $\lambda \ge 0$ for any solution (x, z) of (P). So we introduce the "Lagrangian submanifold with boundary":

(3.26)  $V^+ = \{(x, y;\ -2\lambda x, 2y;\ \|y\|^2) \mid \|x\|^2 = 1,\ \lambda \ge 0\}$

and we state problem (P) as:

(P)  ext $V^+_A$ .

The Legendre transform of $V^+$ is again a Lagrangian submanifold with boundary. Going through the computations, we write it as a disjoint union $\mathcal{L}V^+ = \Omega \cup \Gamma$, where $\Gamma$ is the boundary:

(3.27)  $\Omega = \{(p, q;\ -p/\|p\|, q/2;\ -\|p\| + \|q\|^2/4) \mid p \ne 0\}$

(3.28)  $\Gamma = \{(0, q;\ \xi, q/2;\ \|q\|^2/4) \mid \|\xi\|^2 = 1\}$ .

$\mathcal{L}V^+$ is clearly associated with the function $(p, q) \mapsto -\|p\| + \|q\|^2/4$. The function $p \mapsto -\|p\|$ is not differentiable at the origin, but let us agree that:

(3.29)  $\frac{\partial}{\partial p}(-\|p\|)\big|_{p=0} = \{\xi \in \mathbb{R}^n \mid \|\xi\|^2 = 1\}$ .

This being agreed upon, we can now state the dual problem (P*) in the following way:

(P*)  $\operatorname{ext}_q\ -\|A^*q\| + \|q\|^2/4$ .

Theorem 3.2 implies that whenever $(q, -\|A^*q\| + \|q\|^2/4)$ is a solution to (P*), all couples $(x, \|Ax\|^2)$ given by:

(3.31)  $x = A^*q/\|A^*q\|$ if $A^*q \ne 0$,  $Ax = q/2$,  $\|x\|^2 = 1$

(3.32)  $\|Ax\|^2 = \|A^*q\| - \|q\|^2/4$

are solutions to (P); in other words x is an eigenvector of $A^*A$ with norm one, and $\|A^*q\| - \|q\|^2/4$ is the corresponding eigenvalue. For instance, formula (3.29) shows us that (0, 0) is a solution to (P*) provided there exists $\xi \in \mathbb{R}^n$ with $\|\xi\|^2 = 1$ and $A\xi = 0$. Formulae (3.31) and (3.32) then yield the trivial fact that every such $\xi$ is an eigenvector of $A^*A$ with eigenvalue 0. Note as a conclusion that -{ext P*} is just the spectrum of $A^*A$.
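Relations (3.31)-(3.32) are easy to test on a small matrix. The sketch below is only an illustration: the 2x2 matrix `A` is an arbitrary choice, and the helpers (`matvec`, `eigenpairs_sym2`, `G`, `grad_G`) are hypothetical names written for this check. It starts from the eigenpairs of $A^*A$, builds q = 2Ax, and verifies that q is a critical point of the dual function with value $-\lambda$.

```python
import math

A = [[1.0, 1.0],
     [0.0, 1.0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

At = transpose(A)
AtA = [[sum(At[i][k] * A[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]

def eigenpairs_sym2(M):
    """Eigenpairs of a symmetric 2x2 matrix with nonzero off-diagonal entry."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    disc = math.sqrt((a - c) ** 2 + 4 * b * b)
    pairs = []
    for lam in ((a + c - disc) / 2, (a + c + disc) / 2):
        v = [b, lam - a]
        n = norm(v)
        pairs.append((lam, [v[0] / n, v[1] / n]))
    return pairs

def G(q):
    # dual objective  -|A^T q| + |q|^2 / 4
    return -norm(matvec(At, q)) + sum(c * c for c in q) / 4

def grad_G(q):
    # gradient of G, valid where A^T q != 0
    Atq = matvec(At, q)
    n = norm(Atq)
    AAtq = matvec(A, Atq)
    return [-AAtq[i] / n + q[i] / 2 for i in range(len(q))]

results = []
for lam, x in eigenpairs_sym2(AtA):
    q = [2 * c for c in matvec(A, x)]        # Ax = q/2, as in (3.31)
    Atq = matvec(At, q)
    x_back = [c / norm(Atq) for c in Atq]    # x = A^T q / |A^T q|
    results.append((lam, G(q), norm(grad_G(q)), x, x_back))
```

Each entry of `results` pairs an eigenvalue $\lambda$ of $A^*A$ with the dual value $G(q)$, which should equal $-\lambda$, together with the norm of the gradient of the dual function at q.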

We now treat the same problem in another way. We define a subset W of $\mathbb{R}^{n+m} \times \mathbb{R}^{n+m} \times \mathbb{R}$ by:

(3.33)  $W = \{(x, y;\ -2x\|y\|^2/\|x\|^4,\ 2y/\|x\|^2;\ \|y\|^2/\|x\|^2) \mid x \ne 0\} \cup \{(0, 0;\ 0, \eta;\ 0) \mid \eta \in \mathbb{R}^m\}$ .

It can be checked that W is a Lagrangian submanifold. We associate with it the extremization problem:

(P)  ext $W_A$

which we state somewhat loosely as:

(P)  ext $\|Ax\|^2/\|x\|^2$ .

Of course, solving (P) is just looking for the eigenspaces of $A^*A$. We now construct the dual problem (P*). A simple computation yields:

(3.34)  $\mathcal{L}W = \{(p, q;\ -2p/\|q\|^2,\ 2q\|p\|^2/\|q\|^4;\ -\|p\|^2/\|q\|^2) \mid q \ne 0\} \cup \{(0, 0;\ \pi, 0;\ 0) \mid \pi \in \mathbb{R}^n\}$ .

The dual problem (P*), which is ext $(\mathcal{L}W)_{-A^*}$, will be stated somewhat loosely as:

(3.35)  (P*)  ext $-\|A^*q\|^2/\|q\|^2$ .

We leave it to the reader to see what becomes of formulae (3.10)-(3.11). They tell us essentially that the eigenvalues of $A^*A$ and $AA^*$ coincide, a trivial fact.

We conclude this section by pointing out a technicality: even if V is a Lagrangian submanifold of $\mathbb{R}^{n+m} \times \mathbb{R}^{n+m} \times \mathbb{R}$, the set $V_A$ need not be a Lagrangian submanifold of $\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$. Indeed, it need neither be closed nor be a submanifold. As a simple example, take

(3.36)  $V = \{(x, y;\ -y/x^2, 1/x;\ y/x) \mid x \ne 0\}$

a Lagrangian submanifold of $\mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}$. Setting $A : x \mapsto mx$, we get:

(3.37)  $V_A = \{(x, 0, m) \mid x \ne 0\}$

which is not closed in $\mathbb{R} \times \mathbb{R} \times \mathbb{R}$.
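The computation leading from (3.36) to (3.37) can be replayed numerically. This is a minimal sketch under the definition (3.8), with A taken to be multiplication by an arbitrary number m (here 3.0, an assumption made for the example); the function names are illustrative.

```python
def point_of_V(x, y):
    # (3.36): the Lagrangian submanifold of f(x, y) = y / x, for x != 0
    return (x, y, -y / x ** 2, 1.0 / x, y / x)

def point_of_VA(x, m):
    # (3.8) with A: x -> m x, whose transpose is multiplication by m
    _, _, p, q, z = point_of_V(x, m * x)
    return (x, p + m * q, z)

m = 3.0
samples = [point_of_VA(x, m) for x in (1.0, 0.1, 1e-3, -1e-3)]
```

Every sample has the form (x, 0, m) up to rounding. Letting x tend to 0 produces the limit point (0, 0, m), which is not in $V_A$ because (3.36) excludes x = 0; this is the failure of closedness noted above.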

However, we have the following:

Lemma 3.7. If V is a Lagrangian submanifold and if $V_A$ is a closed submanifold, then $V_A$ is Lagrangian.

Proof. We check condition (1.3) for $V_A$:

(3.38)  $\omega|_{V_A} = dz - (p + A^*q)\,dx = dz - p\,dx - q\,d(Ax)$

which is zero since $(x, Ax;\ p, q;\ z) \in V$, and the restriction of $\omega$ to V vanishes. ■

Note also that if V is the Lagrangian submanifold associated with a $C^\infty$ function $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$, then $V_A$ is the Lagrangian submanifold associated with the $C^\infty$ function $x \mapsto f(x, Ax)$ from $\mathbb{R}^n$ to $\mathbb{R}$, a fact we have used repeatedly in this section.
§ IV. Applications to the calculus of variations.

From now on, $\bar\Omega \subset \mathbb{R}^n$ will be an n-dimensional $C^\infty$-submanifold with boundary $\Gamma$. We set $\Omega = \bar\Omega - \Gamma$, an open subset of $\mathbb{R}^n$; we endow $\Omega$ with the Lebesgue measure $d\omega$ and $\Gamma$ with the induced (n - 1)-dimensional measure $d\gamma$.

We consider a continuous linear map $A : V \to E$, where $E = L^2(\Omega; \mathbb{R}^m)$ and V is some Hilbertian subspace of $H = L^2(\Omega; \mathbb{R}^k)$ (i.e. V is a linear subspace of H endowed with some Hilbertian structure such that the inclusion mapping $V \to H$ is continuous). We assume that there is some Hilbert space T and some continuous linear map $\tau : V \to T$ such that $\tau$ is surjective and $V_0 = \tau^{-1}(0)$ is dense in H. In practical examples, A will be some differential operator, $V_0$ will be $\overline{\mathcal{D}(\Omega)}$, the closure in V of the set of $C^\infty$ functions with compact support in $\Omega$, and $\tau$ will associate with every function in V its "trace" on the boundary $\Gamma$. We shall state an abstract Green's formula for later use:

Theorem 4.1. There exist a Hilbertian subspace $V^*$ of E, and continuous linear maps $A^* : V^* \to H$ and $\tau^* : V^* \to T'$, the topological dual of T, such that, for every $x \in V$ and $q \in V^*$, we have:

(4.1)  $(q, Ax) - (A^*q, x) = \langle \tau^*q, \tau x \rangle$

where $(\cdot, \cdot)$ denotes scalar product in $L^2$ and $\langle \cdot, \cdot \rangle$ denotes the duality pairing between T' and T.

We now turn to extremization problems in the calculus of variations. From now on, we are given a family $W_\omega$, $\omega \in \Omega$, of Lagrangian submanifolds of $\mathbb{R}^{k+m} \times \mathbb{R}^{k+m} \times \mathbb{R}$ and we denote by $F_\omega(x, y)$ the associated characteristic maps. Moreover, we are given a convex lower semi-continuous function $\Phi : T \to \mathbb{R} \cup \{+\infty\}$; as usual in convex analysis, its subdifferential will be denoted by $\partial\Phi$. We now state:

Definition 4.2. The calculus of variations problem (1):

(P)  $\operatorname{ext}_{x \in V} \int_\Omega F_\omega(x(\omega), Ax(\omega))\,d\omega + \Phi(\tau x)$

consists in looking for all mappings $\omega \mapsto (x(\omega), p(\omega), q(\omega), z(\omega))$ from $\Omega$ to $\mathbb{R}^{k+m} \times \mathbb{R}^{k+m} \times \mathbb{R}$ such that:

(4.2)  $x \in V$,  $q \in V^*$,  $z \in L^1$

(4.3)  $(x(\omega), Ax(\omega);\ -A^*q(\omega), q(\omega);\ z(\omega)) \in W_\omega$ for a.e. $\omega \in \Omega$

(4.4)  $\tau^*q \in -\partial\Phi(\tau x)$ .

Any pair $(x, z) \in V \times L^1$ such that there exists $q \in V^*$ satisfying (4.2)-(4.4) will be called an extremal of (P). The number $\zeta$ defined by:

(4.5)  $\zeta = \int_\Omega z(\omega)\,d\omega + \Phi(\tau x)$

will be the associated value of (P). The set of values of problem (P) will be denoted by {ext P}.

The motivation for this definition is clear. In the case where $F_\omega(\xi, \eta) = f(\omega; \xi, \eta)$, a function which is $C^\infty$ in $(\xi, \eta)$ for almost every $\omega \in \Omega$, and measurable in $\omega$ for every $(\xi, \eta) \in \mathbb{R}^k \times \mathbb{R}^m$, Equations (4.2)-(4.4) become:

(4.6)  $f'_\xi(\omega; x(\omega), Ax(\omega)) + A^* f'_\eta(\omega; x(\omega), Ax(\omega)) = 0$ a.e.

(4.7)  $\tau^*[f'_\eta(x, Ax)] \in -\partial\Phi(\tau x)$ .

Equation (4.6) is the Euler-Lagrange equation on $\Omega$ associated with the integral:

(4.8)  $\int_\Omega f(\omega; x(\omega), Ax(\omega))\,d\omega$

and formula (4.7) yields the so-called transversality conditions on the boundary $\Gamma$. In the case where f is convex in $(\xi, \eta)$ for every $\omega$, those are necessary and sufficient conditions for optimality. If f is not convex, but satisfies some growth condition at infinity, we get the first-order conditions for stationarity.

(1) Henceforth denoted by C.V. problem.


$\Phi^*$ is the Fenchel conjugate of $\Phi$ in the sense of convex analysis:
$\Phi^*(\theta') = \sup\{\langle \theta, \theta' \rangle - \Phi(\theta) \mid \theta \in T\}$,  $\forall \theta' \in T'$ .

with value $-\zeta$. Conversely, let $(q, z') \in V^* \times L^1$ be an extremal of (P*) with value $\zeta'$; then, for any $x \in V$ satisfying:

(4.9)  $(-A^*q(\omega), q(\omega);\ x(\omega), Ax(\omega);\ z'(\omega)) \in \mathcal{L}W_\omega$ for a.e. $\omega \in \Omega$

(4.10)  $\tau x \in \partial\Phi^*(-\tau^*q)$

$(x,\ -z' - xA^*q + qAx)$ is an extremal of (P) with value $-\zeta'$. Hence:

(4.11)  {ext P} = -{ext P*} .

Proof. The pointwise equation:

(4.12)  $(x(\omega), Ax(\omega);\ -A^*q(\omega), q(\omega);\ z(\omega)) \in W_\omega$

can be written:

(4.13)  $(-A^*q(\omega), q(\omega);\ x(\omega), Ax(\omega);\ -x(\omega)A^*q(\omega) + Ax(\omega)q(\omega) - z(\omega)) \in \mathcal{L}W_\omega$ .

Moreover, formula (4.4) can also be written:

(4.14)  $\tau x \in \partial\Phi^*(-\tau^*q)$ .

But Equations (4.13) and (4.14), together with $x \in V$, $q \in V^*$, $z \in L^1$, simply mean that $(q,\ -xA^*q + Axq - z)$ is an extremal of (P*). The associated value is:

(4.15)  $\zeta' = \int_\Omega (-x(\omega)A^*q(\omega) + Ax(\omega)q(\omega) - z(\omega))\,d\omega + \Phi^*(-\tau^*q)$ .

Using Green's formula:

(4.16)  $\zeta' = -\int_\Omega z(\omega)\,d\omega + \langle \tau^*q, \tau x \rangle + \Phi^*(-\tau^*q)$ .

Making use of Equation (4.14), this becomes:

(4.17)  $\zeta' = -\int_\Omega z(\omega)\,d\omega - \Phi(\tau x) = -\zeta$ .

Hence the first part of the theorem. The converse is proved along the same lines. ■


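Typical instances of such a mapping $A : V \to E$ are

(4.18)  grad : $H^1(\Omega) \to L^2(\Omega; \mathbb{R}^n)$

(4.19)  $\Delta$ : $H^2(\Omega) \to L^2(\Omega; \mathbb{R})$ .

In the first case, T is $H^{1/2}(\Gamma)$, and Green's formula reads:

(4.20)  $\int_\Omega (\operatorname{grad} x \cdot q + x \cdot \operatorname{div} q)\,d\omega = \int_\Gamma n \cdot q\, x\,d\gamma$ .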


In  the  second  case,  T is  H (T),  and  Green's  formula  reads: 
(4.21)  $\int_\Omega (\Delta x \cdot q - x \cdot \Delta q)\,d\omega = \int_\Gamma (q\, n \cdot \operatorname{grad} x - x\, n \cdot \operatorname{grad} q)\,d\gamma$ .

In both cases, we could define $\Phi$ as:

(4.22)  $\Phi(\theta) = 0$ if $\theta = \theta_0$,  $+\infty$ otherwise

which gives a Dirichlet condition (fixed boundary values). We could also define:

(4.23)  $\Phi(\theta) = 0$ if $\int_\Gamma \theta = 0$,  $+\infty$ otherwise

which is a kind of periodicity condition.


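Let us give an example. The problem

(P)  ext $\int_\Omega f(\omega; x(\omega), \operatorname{grad} x(\omega))\,d\omega$
     $x \in H^1(\Omega)$,  $\int_\Gamma x(\gamma)\,d\gamma = 0$

has the following dual:

(P*)  ext $\int_\Omega \mathcal{L}f(\omega; -\operatorname{div} q(\omega), q(\omega))\,d\omega$
      $q \in H(\Omega; \operatorname{div})$,  $q \cdot n$ = constant on $\Gamma$

where $H(\Omega; \operatorname{div}) = \{u \in L^2(\Omega; \mathbb{R}^n) \mid \operatorname{div} u \in L^2(\Omega)\}$. The task of rewriting (4.2)-(4.4) and (4.9)-(4.11) is left to the reader.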

We are now going to show that we can get simultaneously the extremals (x, z) of (P) and the extremals (q, z') of (P*) from the extremals of a single C.V. problem:

Proposition 4.4. Consider the C.V. problems:

(D)   ext $\int_\Omega [-A^*q(\omega) \cdot x(\omega) + q(\omega)y(\omega) - F_\omega(x(\omega), y(\omega))]\,d\omega + \Phi^*(-\tau^*q)$
      $(x, y, q) \in V \times E \times V^*$

(D*)  ext $\int_\Omega [p(\omega)x(\omega) + q(\omega) \cdot Ax(\omega) - \mathcal{L}F_\omega(p(\omega), q(\omega))]\,d\omega + \Phi(\tau x)$
      $(x, p, q) \in V \times H \times V^*$ .

The following are equivalent statements:

(a) (x, y, q, z') is an extremal of (D)

(b) (x, p, q, z) is an extremal of (D*)

(c) (x, q, z) satisfy (4.2)-(4.4)

(d) (q, x, z') satisfy (4.9)-(4.10) and $z' \in L^1$

with $z + z' = -A^*q \cdot x + q \cdot Ax$. In particular (x, z) is an extremal of (P) and (q, z') an extremal of (P*).

Proof. We have already shown that (c) and (d) are equivalent. We shall be content with proving that (a) and (c) are equivalent: the proof that (b) and (d) are equivalent goes along the same lines.

Problem (D) can be written as:

(D)  ext $\int_\Omega \mathcal{F}_\omega(x(\omega), y(\omega), -A^*q(\omega), q(\omega))\,d\omega + \Phi^*(-\tau^*q)$
     $(x, y, q) \in V \times E \times V^*$

where $\mathcal{F}_\omega$ is the characteristic map associated with the Lagrangian submanifold $\mathcal{W}_\omega$ of $\mathbb{R}^{2k+2m} \times \mathbb{R}^{2k+2m} \times \mathbb{R}$ defined by:

(4.24)  $\mathcal{W}_\omega = \{(\xi, \eta, \pi, \rho;\ \pi - \sigma, \rho - \tau, \xi, \eta;\ \pi\xi + \rho\eta - \zeta) \mid \pi \in \mathbb{R}^k,\ \rho \in \mathbb{R}^m,\ (\xi, \eta;\ \sigma, \tau;\ \zeta) \in W_\omega\}$ .

We now apply Definition 4.2 to the Hilbert space $\mathcal{V} = V \times E \times V^*$ and the map $\mathcal{A} : \mathcal{V} \to E$ defined by $\mathcal{A}(x, y, q) = -A^*q$; its adjoint will be the map $\mathcal{A}^* : V \to H \times E \times H$ defined by $\mathcal{A}^*(x') = (0, 0, -Ax')$. Conditions (4.2)-(4.4) then become:

(4.25)  $x \in V$,  $y \in E$,  $q \in V^*$,  $x' \in V$,  $z' \in L^1$

(4.26)  $(x(\omega), y(\omega), -A^*q(\omega), q(\omega);\ 0, 0, x'(\omega), Ax'(\omega);\ z'(\omega)) \in \mathcal{W}_\omega$ a.e.

(4.27)  $\tau x' \in \partial\Phi^*(-\tau^*q)$ .

So $(x, y, q, z') \in V \times E \times V^* \times L^1$ is an extremal of (D) if and only if there exists $x' \in V$ such that (4.26) and (4.27) are satisfied.

Now, comparing (4.26) with (4.24), we get:

(4.28)  $-A^*q(\omega) = \sigma$
(4.29)  $q(\omega) = \tau$
(4.30)  $x'(\omega) = x(\omega)$
(4.31)  $Ax'(\omega) = y(\omega)$
(4.32)  $z'(\omega) = -A^*q(\omega) \cdot x(\omega) + q(\omega)y(\omega) - \zeta$
(4.33)  $(x(\omega), y(\omega);\ \sigma, \tau;\ \zeta) \in W_\omega$ .

All this boils down to:

(4.34)  $(x(\omega), Ax(\omega);\ -A^*q(\omega), q(\omega);\ z(\omega)) \in W_\omega$ a.e.

with $z(\omega) + z'(\omega) = -A^*q(\omega) \cdot x(\omega) + q(\omega) \cdot Ax(\omega)$. Taking (4.30) into account, (4.27) becomes:

(4.35)  $\tau x \in \partial\Phi^*(-\tau^*q)$

which can be inverted to:

(4.36)  $-\tau^*q \in \partial\Phi(\tau x)$ .

But (4.34) and (4.36) are just (c), and we have proved our claim. ■

Proposition 4.4 can be considered a smooth version of the saddle-point property for Lagrange multipliers in convex optimization. Note that in the case where $F_\omega(\xi, \eta) = f(\omega; \xi, \eta)$, measurable in $\omega$, $C^\infty$ in $(\xi, \eta)$, problem (P*) involves $\mathcal{L}f(\omega; \xi, \eta)$, which typically is multivalued and cusped; working with problem (D) is a way of circumventing this inconvenience at the cost of increasing the dimension.

We now apply this idea of "smoothing out" Legendre transforms to another example.

Proposition 4.5. We are given a $C^\infty$ function $\varphi : [0, T] \times \mathbb{R}^n \to \mathbb{R}$, a measurable function $f : [0, T] \to \mathbb{R}^n$, and a point $\xi_0 \in \mathbb{R}^n$. We consider the differential equation:

($\mathcal{E}$)  $\frac{dx}{dt} + \varphi'_x(t, x) = f$ a.e. on [0, T],  $x(0) = \xi_0$

and the C.V. problems:

(P)  ext $\int_0^T [\varphi(t, x) + \mathcal{L}\varphi(t; f - \frac{dx}{dt}) + x(\frac{dx}{dt} - f)]\,dt$
     $x \in H^1(0, T; \mathbb{R}^n)$,  $x(0) = \xi_0$

(D)  ext $\int_0^T [\varphi(t, x) - \varphi(t, y) + (\frac{dx}{dt} - f)(x - y)]\,dt$
     $x \in H^1(0, T; \mathbb{R}^n)$,  $y \in H^1(0, T; \mathbb{R}^n)$,  $x(0) = y(0) = \xi_0$ .

If Equation ($\mathcal{E}$) has no solution, then problems (P) and (D) have no extremals. If Equation ($\mathcal{E}$) has a solution $\bar x$, then problem (P) has a unique extremal $(\bar x, 0)$, and problem (D) has a unique extremal $(\bar x, \bar x, 0)$.


Proof. Problem (D) arises from problem (P) by replacing $\mathcal{L}\varphi(f - \frac{dx}{dt})$ by $y(f - \frac{dx}{dt}) - \varphi(y)$, i.e. by smoothing out that part of the integrand which is a Legendre transform. Proposition 4.4 does not readily apply to this case, so we give a direct proof.

An extremal (x, y, z) of (D) is defined by the Euler equations:

(4.37)  $\varphi'_x(t, x) + \frac{dx}{dt} - f = \frac{d}{dt}(x - y)$

(4.38)  $-\varphi'_x(t, y) - \frac{dx}{dt} + f = 0$

and the boundary conditions $x(0) = y(0) = \xi_0$. Together, they yield the system of differential equations on [0, T]:

(4.39)  $\frac{dy}{dt} + \varphi'_x(t, x) = f$,  $y(0) = \xi_0$

(4.40)  $\frac{dx}{dt} + \varphi'_x(t, y) = f$,  $x(0) = \xi_0$ .

Now this is to be compared with equation

($\mathcal{E}$)  $\frac{dx}{dt} + \varphi'_x(t, x) = f$,  $x(0) = \xi_0$ .

The assumptions on $\varphi$ imply that both system (4.39)-(4.40) and equation ($\mathcal{E}$) have at most one solution. If $\bar x$ is the solution of ($\mathcal{E}$), obviously $(\bar x, \bar x)$ is the solution of (4.39)-(4.40). Conversely, if $(\bar x, \bar y)$ is a solution of (4.39)-(4.40), then so is $(\bar y, \bar x)$; from the uniqueness, it follows that $\bar x = \bar y$, obviously the solution of ($\mathcal{E}$). Writing $x = \bar x = y = \bar y$ in the integrand, we see that it is identically zero. We have proved the equivalence of equation ($\mathcal{E}$) and problem (D).

The equivalence of problems (P) and (D) goes along the lines set up in Proposition 4.4. Indeed, equation (4.38) means simply that:

(4.41)  $-\varphi(t, y) - y(\frac{dx}{dt} - f) = \mathcal{L}\varphi(t; f - \frac{dx}{dt})$

and the integrands in (P) and (D) become equal. With x = y, formula (4.41) yields, with a slight misuse of notations:

(4.42)  $[\mathcal{L}\varphi]'(t; f - \frac{dx}{dt}) = x$

and the Euler equation for (P) turns out to be exactly equation ($\mathcal{E}$). ■
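The symmetry argument of this proof can be observed on a discretization. The sketch below is an illustration only, under assumptions not taken from the text: n = 1, the nonconvex choice $\varphi(t, x) = x^4/4 - x^2/2$, the forcing $f(t) = \sin t$, $\xi_0 = 0.5$, and a forward Euler scheme. It integrates system (4.39)-(4.40) from symmetric initial data and accumulates the integrand of (D). It shows that the symmetric solution has x = y and a vanishing integrand; it does not prove uniqueness.

```python
import math

def phi(x):
    # phi(t, x) = x^4/4 - x^2/2 (nonconvex, independent of t)
    return x ** 4 / 4 - x ** 2 / 2

def dphi(x):
    return x ** 3 - x

def forcing(t):
    return math.sin(t)

def integrate(xi0=0.5, T=2.0, steps=2000):
    """Forward Euler for (4.39)-(4.40); returns the (x, y) path and the
    Riemann sum of the integrand of (D)."""
    h = T / steps
    x, y, t = xi0, xi0, 0.0
    path, total = [(x, y)], 0.0
    for _ in range(steps):
        dx = forcing(t) - dphi(y)   # (4.40): dx/dt + phi'_x(t, y) = f
        dy = forcing(t) - dphi(x)   # (4.39): dy/dt + phi'_x(t, x) = f
        total += h * (phi(x) - phi(y) + (dx - forcing(t)) * (x - y))
        x, y, t = x + h * dx, y + h * dy, t + h
        path.append((x, y))
    return path, total

path, total = integrate()
```

Because both components start equal and are updated by the same arithmetic, they stay equal at every step, and each step then coincides with an Euler step for equation ($\mathcal{E}$) itself.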

Note that we have defined directly the extremals of a problem in the calculus of variations, without reference to any extremization problem. This is because the natural extremization problem involved is infinite-dimensional, and the results of the preceding sections do not extend readily to this case; indeed, smoothness assumptions which are natural in finite dimensions become preposterous in this new setting. In some particular cases, however, it can be made to work. Let us give an example, which will be recognized as an infinite-dimensional version of the example concluding Section III.
We consider the space $V = H_0^1(\Omega)$ and the function:

(4.43)  $f : V \setminus \{0\} \times L^2(\Omega)^n \to \mathbb{R}$

(4.44)  $f(x, y) = |y|^2/|x|^2$

with $|\cdot|$ denoting the $L^2$-norm. Obviously f is a $C^\infty$ function, with:

(4.45)  $p = f'_x(x, y) = -2x|y|^2/|x|^4 \in L^2(\Omega)$

(4.46)  $q = f'_y(x, y) = 2y/|x|^2 \in L^2(\Omega)^n$ .

We now set:

(4.47)  $y = \operatorname{grad} x$

to get the extremization problem:

(P)  ext $|\operatorname{grad} x|^2/|x|^2$
     $x \in H_0^1(\Omega)$,  $x \ne 0$ .

Let us write out the equation for a critical point, taking into account the fact that the transpose of grad : $H_0^1(\Omega) \to L^2(\Omega)^n$ is -div : $L^2(\Omega)^n \to H^{-1}(\Omega)$:

(4.48)  $0 = p - \operatorname{div} q = -2(x\,|\operatorname{grad} x|^2/|x|^2 + \operatorname{div} \operatorname{grad} x)/|x|^2$ .

Note that $|\operatorname{grad} x|$ cannot be zero unless x is, so (4.48) becomes:

(4.49)  $x = -\frac{|x|^2}{|\operatorname{grad} x|^2}\,\Delta x$,  $x \ne 0$ .

In other words, the solutions of (P) are the pairs $(x, |\lambda|)$ where $\lambda$ is a nonzero eigenvalue of the Laplacian under homogeneous boundary conditions, and x any nonzero eigenvector.

To get the dual problem, we note that (4.45) and (4.46) are invertible whenever $y \ne 0$, yielding:

(4.50)  $x = -2p/|q|^2$,  $y = 2q|p|^2/|q|^4$

so we are in the particularly simple case where the Legendre transformation is one-to-one. Equations (4.48) and (4.47) become:

(4.51)  $p = \operatorname{div} q \in L^2$

(4.52)  $2(q\,|p|^2/|q|^2 + \operatorname{grad} \operatorname{div} q)/|q|^2 = 0$ .

But this means exactly that $q \ne 0$ is a critical point of the function $q \mapsto -|\operatorname{div} q|^2/|q|^2$ over the space:

(4.53)  $H(\Omega; \operatorname{div}) = \{q \in L^2(\Omega)^n \mid \operatorname{div} q \in L^2(\Omega)\}$ .

Finally, we get the dual problem:

(P*)  ext $-|\operatorname{div} q|^2/|q|^2$
      $q \in H(\Omega; \operatorname{div})$

with the usual relationship (4.45)-(4.46) or (4.50). Note in particular that:

(4.54)  {ext P} = -{ext P*} .
§ V. Comments.


The notion of a Lagrangian submanifold is central to the theory of Fourier integral operators. It is attributed to V. Arnold [1] or V. Maslov [13], and has been painstakingly investigated ([11], [16], [9]). However, these authors define a Lagrangian submanifold of a symplectic manifold (dimension 2n, fundamental 2-form $\Omega$) as an n-dimensional submanifold on which $\Omega$ pulls back to zero. In our framework, this would mean an n-dimensional submanifold of $\mathbb{R}^n \times \mathbb{R}^n$ on which $\sum_{i=1}^{n} dp_i \wedge dx_i$ pulls back to zero. Noting $\omega = dz - \sum_{i=1}^{n} p_i\,dx_i$ as in (1.3), we see that $\Omega = -d\omega$. It follows that if $V \subset \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$ is a Lagrangian submanifold in the sense of Definition 1.1, if the projection $\pi_{xp} : V \to \mathbb{R}^n \times \mathbb{R}^n$ is proper and if its tangent map $T\pi_{xp} : TV \to T(\mathbb{R}^n \times \mathbb{R}^n)$ has rank n everywhere, then $\pi_{xp}V$ is a Lagrangian submanifold of $\mathbb{R}^n \times \mathbb{R}^n$ in the preceding sense. Our definition has the advantage of incorporating z, which is very useful for practical purposes.

For basic information about proper maps, we refer to any book on general topology, e.g. [4]. Sard's theorem in the $C^\infty$ case, as well as basic information on submanifolds and the implicit function theorem, can be found in [12].

The definition (2.1) of the Legendre transformation is given in [6]
as a particular case of a contact transformation. The contact transformation
associated with a given smooth function $H(x, z; x', z')$ on
$\mathbb{R}^n \times \mathbb{R} \times \mathbb{R}^n \times \mathbb{R}$
is the mapping which associates with any point
$(x, p, z) \in \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$
the point $(x', p', z')$ defined by the formulae:

$$
\begin{cases}
H(x', z'; x, z) = 0 \\
\partial H/\partial x' + p'\,\partial H/\partial z' = 0 \\
\partial H/\partial x + p\,\partial H/\partial z = 0 .
\end{cases}
$$

From the last two equations it follows (formally) that $p = \partial z/\partial x$
and $p' = \partial z'/\partial x'$. It follows (still formally) from the first one that
$dz' - p'\,dx' = 0$ if and only if $dz - p\,dx = 0$. In other words, if we
have no trouble with cusps or closedness, a contact transformation will
send a Lagrangian manifold onto a Lagrangian manifold. It need not be
involutive. In the special case where $H(x', z'; x, z) = z + z' - xx'$, we
get the Legendre transformation.
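
For the reader's convenience, the special case can be worked out explicitly.
This is a formal computation only, in which $xx'$ is read as the scalar
product $\sum_i x_i x'_i$ (our reading of the notation).

```latex
\begin{align*}
H &= z + z' - x \cdot x', \\
\partial H/\partial x + p\,\partial H/\partial z &= -x' + p = 0
  \;\Longrightarrow\; x' = p, \\
\partial H/\partial x' + p'\,\partial H/\partial z' &= -x + p' = 0
  \;\Longrightarrow\; p' = x, \\
H = 0 &\;\Longrightarrow\; z' = x \cdot x' - z = p \cdot x - z, \\
dz' - p' \cdot dx' &= (p \cdot dx + x \cdot dp - dz) - x \cdot dp
  = -(dz - p \cdot dx).
\end{align*}
```

So $(x, p, z) \mapsto (p, x, p \cdot x - z)$, and the form $dz - p\,dx$ is sent to
its opposite, so that its zeroes are preserved. In this special case the map
is its own inverse, although a general contact transformation need not be.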

Also related to the Legendre transform is the notion of dual varieties
in algebraic geometry. Let a projective variety $C$ be given by its
equation $P(X_1, \ldots, X_n) = 0$, where $P$ is a homogeneous polynomial
of degree $d$. The dual variety $\hat{C}$ is the set of tangents to $C$; its
equation $\hat{P}(u_1, \ldots, u_n) = 0$ has as zeroes all $(u_1, \ldots, u_n)$ such that the
hyperplane $u_1 X_1 + \cdots + u_n X_n = 0$ is tangent to $C$. In particular,
$\hat{\hat{C}} = C$. For instance, if $f : \mathbb{R}^n \to \mathbb{R}$ is a polynomial, setting
$z = X_{n+1}/X_{n+2}$, $x_i = X_i/X_{n+2}$
as is usual in projective geometry yields:

$$
\operatorname{graph} f = \Big\{ (X_1, \ldots, X_{n+2}) \;:\;
  f\Big( \frac{X_1}{X_{n+2}}, \ldots, \frac{X_n}{X_{n+2}} \Big) = \frac{X_{n+1}}{X_{n+2}} \Big\} .
$$

The dual variety is simply the graph of the Legendre transform:

$$
\widehat{\operatorname{graph} f} = \operatorname{graph} \mathcal{L} f .
$$
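
For $n = 1$ the statement can be checked by hand: the tangent to graph $f$ at
$x_0$ is the line $z = p x - z'$ with $p = f'(x_0)$ and $z' = p x_0 - f(x_0)$, so the
pair (slope, minus intercept) is a point of the Legendre transform. The
Python sketch below illustrates this numerically. The quartic `f` and the
function names are hypothetical choices made for the example, not taken
from the paper.

```python
def f(x):
    """Nonconvex quartic, chosen only as an illustration."""
    return x**4 / 4 - x**2 / 2


def df(x):
    """Derivative of f."""
    return x**3 - x


def tangent_line(x0):
    """Slope and intercept of the tangent line to graph f at x0."""
    slope = df(x0)
    intercept = f(x0) - slope * x0
    return slope, intercept


def legendre_point(x0):
    """Point (p, z') of the Legendre transform: p = f'(x0), z' = p*x0 - f(x0)."""
    p = df(x0)
    return p, p * x0 - f(x0)


if __name__ == "__main__":
    # The three critical points of f all have slope p = 0.
    for x0 in (-1.0, 0.0, 1.0):
        slope, intercept = tangent_line(x0)
        p, z_dual = legendre_point(x0)
        # (slope, -intercept) and (p, z') describe the same tangent line.
        print(x0, (slope, -intercept), (p, z_dual))
```

Since $f$ is not convex, the slope $p = 0$ is attained at three points with two
different values of $z'$, so the transform is a curve rather than the graph of
a function.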

A particularly interesting case arises when $n = 1$ and complex
numbers are used. It can be shown that, if $C$ (resp. $\hat{C}$) is a complex
algebraic curve of degree $d$ (resp. $\hat{d}$), having $r$ (resp. $\hat{r}$) double points
and $s$ (resp. $\hat{s}$) cusps, with no other singularities, then we have the
following symmetric relationship (Plücker's formulae):

$$
\begin{cases}
\hat{d} = d(d - 1) - 2r - 3s \\
d = \hat{d}(\hat{d} - 1) - 2\hat{r} - 3\hat{s} \\
\hat{s} - s = 3(\hat{d} - d) .
\end{cases}
$$
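
The three relations determine $(\hat{d}, \hat{r}, \hat{s})$ from $(d, r, s)$. A minimal
Python sketch of that arithmetic follows. The function name `plucker_dual`
is hypothetical, and the code assumes, as in the statement above, a curve
whose only singularities are double points and cusps.

```python
def plucker_dual(d, r, s):
    """Solve the three Plucker relations for the dual curve.

    d: degree, r: number of double points, s: number of cusps.
    Returns (d_hat, r_hat, s_hat).
    """
    # First relation: degree of the dual curve.
    d_hat = d * (d - 1) - 2 * r - 3 * s
    # Third relation: cusps of the dual curve.
    s_hat = s + 3 * (d_hat - d)
    # Second relation, solved for twice the number of dual double points.
    twice_r_hat = d_hat * (d_hat - 1) - d - 3 * s_hat
    if d_hat < 0 or s_hat < 0 or twice_r_hat < 0 or twice_r_hat % 2:
        raise ValueError("input is not consistent with the Plucker formulae")
    return d_hat, twice_r_hat // 2, s_hat


if __name__ == "__main__":
    for name, data in [("smooth conic", (2, 0, 0)),
                       ("nodal cubic", (3, 1, 0)),
                       ("cuspidal cubic", (3, 0, 1)),
                       ("smooth quartic", (4, 0, 0))]:
        print(name, data, "->", plucker_dual(*data))
```

For a smooth quartic this gives $(\hat{d}, \hat{r}, \hat{s}) = (12, 28, 24)$, and applying
the function to the result returns $(4, 0, 0)$, in line with the symmetry of
the formulae.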

I am  indebted  to  P.  Deligne  for  this  elementary  algebraic  geometry. 
Now  let  us  proceed  to  providing  Sections  II,  III,  IV  with  bibliographical 
references. 

Fundamentals of convex analysis are given in [14] or [8]. Modern
tools of differential topology, including the Malgrange division theorem,
Thom's transversality theorem and notions on stratifications, will be
found in [15]; see [10] for a textbook on the subject. Note that the proof
of Proposition 2.6 for $n = 1$ does not require the $C^\infty$ division theorem.

Condition (3.7) can be interpreted as a necessary condition for
optimality in a much broader context than indicated, i.e. the space need
not be finite-dimensional and the $g_j$ need not be linearly independent:
see [7]. Duality theory for finite-dimensional convex optimization
problems will be found in [14].
Theorem 4.1 is due to J.-P. Aubin. Its proof will be found in [2]
or [3]. Duality theory for convex problems in the calculus of variations
is treated in [8], but here we follow rather the approach of [3].
Proposition 4.5 is a nonconvex analogue of [5]. I am indebted to R. Temam
for suggesting to me the eigenvalue examples concluding Sections III and IV.


REFERENCES 


[1]  V. I. Arnold, Characteristic class entering in quantization conditions,
J. Funct. An. Appl. 1 (1967), pp. 1-13.

[2]  J.-P. Aubin, Approximation of elliptic boundary-value problems,
Wiley, 1972.

[3]  J.-P. Aubin, Mathematical methods of game and economic theory,
North-Holland Elsevier, to appear.

[4]  N. Bourbaki, Topologie générale, Chapitres 1 et 2, 2ème édition,
Hermann.

[5]  H. Brezis et I. Ekeland, Un principe variationnel associé à certaines
équations paraboliques, Comptes Rendus Ac. Sci. Paris, 282 (1976),
pp. 971-974 and 1197-1198.

[6]  C. Carathéodory, Variationsrechnung und partielle Differentialgleichungen
erster Ordnung, Teubner, 1935.

[7]  F. Clarke, A new approach to Lagrange multipliers, Mathematics
of Operations Research 1 (1976).

[8]  I. Ekeland and R. Temam, Convex analysis and variational problems,
North-Holland Elsevier, 1975.

[9]  J. Guckenheimer, Catastrophes and partial differential equations,
Ann. Institut Fourier, Grenoble, 23 (1973), 2, pp. 31-59.

[10] M. Golubitsky and V. Guillemin, Stable mappings and their
singularities, Springer Verlag, 1973.

[11] L. Hörmander, Fourier integral operators I, Acta Math. 127 (1971),
pp. 79-183 (especially section 3).



[12] S. Lang, Differentiable manifolds, Addison-Wesley, 1972.

[13] V. Maslov, Theory of perturbations and asymptotic methods, Moskov.
Gos. Univ., Moscow, 1965 (Russian).

[14] R. T. Rockafellar, Convex analysis, Princeton University Press, 1969.

[15] Wall ed., Proceedings of the Liverpool singularities symposium I,
Springer Lecture Notes in Mathematics 192, 1971.

[16] A. Weinstein, Symplectic manifolds and their Lagrangian submanifolds,
Advances in Math. 6 (1971), pp. 329-346.




REPORT DOCUMENTATION PAGE

Security classification of this page: UNCLASSIFIED

Title: DUALITY IN NONCONVEX OPTIMIZATION AND CALCULUS OF VARIATIONS
Type of report and period covered: Summary Report, no specific reporting period
Author: Ivar Ekeland
Contract number: DAAG29-75-C
Performing organization name and address: Mathematics Research Center,
University of Wisconsin, 610 Walnut Street, Madison, Wisconsin 53706
Controlling office name and address: U. S. Army Research Office,
P.O. Box 12211, Research Triangle Park, North Carolina 27709
Security class (of this report): UNCLASSIFIED
Distribution statement (of this report): Approved for public release;
distribution unlimited
Key words: dual problem, calculus of variations, Euler equation,
Legendre transformation, Lagrangian submanifold, optimization problem

Abstract: A general duality theory is given for smooth nonconvex optimization
problems, covering both the finite-dimensional case and the calculus of
variations. The results are quite similar to the convex case; in particular, with
every problem (P) is associated a dual problem (P*) having opposite value.
This is done at the expense of broadening the framework, from smooth functions
$\mathbb{R}^n \to \mathbb{R}$ to Lagrangian submanifolds of
$\mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$.


